Models for Approximating Basilar Mem- 
brane Displacement — Part II. Effects of 
Middle-Ear Transmission and Some 
Relations between Subjective 
and Physiological Behavior 

By JAMES L. FLANAGAN 

(Manuscript received December 26, 1961) 

This report presents the second half of results of a study on the peripheral 
ear. There are two objectives: (1) to derive computational models for ap- 
proximating the mechanical displacement of the basilar membrane when 
the sound pressure at the eardrum is known, and (2) to demonstrate certain 
relations between subjective behavior measured experimentally and physio- 
logical behavior calculated from the models. The report describes a rational 
function approximation of middle-ear transmission. This result, in combi- 
nation with previously derived models for the inner ear, permits an analytical 
approximation of basilar membrane displacements in both apical and basal 
regions. Because the models are rational functions, they can, if desired, be 
simulated by lumped-constant electrical networks. Their computational 
tractability also permits straightforward approximations to temporal and 
spatial derivatives of displacement. Relations between computed membrane 
displacement and subjective behavior are illustrated for several psycho- 
acoustic phenomena, namely pitch perception, binaural lateralization, bi- 
naural time-intensity trade, threshold discrimination, and pure-tone mask- 
ing. The extent to which some of these phenomena can be correlated with, 
identified in, and predicted by the mechanical operation of the peripheral 
ear appears to be substantial. 

Part I of this report 1 described three analytical models for approxi- 
mating the displacement of the basilar membrane when the human ear 
is stimulated by sound. These models were valid for points lying roughly 

in the apical half of the membrane, that is, for frequencies less than about 

THE BELL SYSTEM TECHNICAL JOURNAL, MAY 1962 




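[Fig. 1 block diagram: the sound pressure $p(t)$ drives the MIDDLE EAR
block, $G(s) = X(s)/P(s)$, whose output is the stapes displacement $x(t)$;
this drives the BASILAR MEMBRANE block, $F_l(s) = Y_l(s)/X(s)$, whose output
is the membrane displacement $y_l(t)$. Overall,
$Y_l(s)/P(s) = G(s)\,F_l(s)$.]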



Fig. 1 — Schematic diagram of peripheral ear and functional relations between 
acousto-mechanical quantities. 

1000 cps. Over this frequency range the elastic effects of the middle ear 
predominate, and the displacement of the stapes footplate is essentially 
proportional to, and in phase with, the sound pressure at the eardrum. 
At higher frequencies the mass and viscous properties of the middle ear 
become important, and the displacement transmission to the stapes is 
no longer constant with frequency. Applicability of the previously de- 
rived models to this range of frequencies depends upon being able to 
account for middle-ear transmission. This report describes an effort to 
derive a computational model for middle-ear transmission and to ex- 
amine its relationship with the models for membrane displacement. 
Subsequent to this, an attempt is made to relate the mechanical opera- 
tion of the ear, as described by the models, to several facets of subjective 
auditory behavior. 

I. EFFECTS OF MIDDLE-EAR TRANSMISSION UPON MEMBRANE DISPLACE- 
MENT* 

The physiological functions to be considered are illustrated schemati- 
cally in Fig. 1. p(t) represents the sound pressure at the eardrum as a 

* The material in this section was presented orally before the 60th meeting 
of the Acoustical Society of America, San Francisco, California, October, 1960. 
The abstract appears in J. Acoust. Soc. Am. 32, 1960, p. 1494. 

function of time; $x(t)$ is the equivalent linear displacement of the stapes 
footplate;* and $y_l(t)$ is the displacement of the basilar membrane 
(cochlea shown uncoiled) at a distance $l$ from the stapes. In terms of 
frequency-domain (Laplace) transforms, the middle-ear transmission is 
represented by $G(s)$ and the stapes-to-membrane transmission by $F_l(s)$. 

In deducing approximations to these functions, the peripheral ear is 
assumed both to be mechanically linear over the range of interest and to 
constitute a passive system. A passive system is stable by definition. It 
has no normal modes whose amplitudes increase indefinitely with time. 
The functions $G(s)$ and $F_l(s)$ can therefore be approximated by rational 
functions of frequency whose coefficients are real and whose poles and 
zeros are either real or occur in complex conjugates. The functions can 
have no poles with positive real parts and only simple poles with zero 
real parts. 

The earlier paper essentially treated functional approximations to 
$F_l(s)$ (that is, middle-ear transmission was assumed constant with 
frequency, or, in the present notation, $G(s) = 1$). Two of the previously 
derived models will be useful in the present discussion. They are the first 
and third which, according to the notation used earlier, were called 
$F_1(s)$ and $F_3(s)$. For convenience they are reproduced here and are: 



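\[
F_1(s) = c_1\,\beta_l^{4+r}\left(\frac{s+\epsilon_l}{s+\gamma_l}\right)
\left[\frac{1}{(s+\alpha_l)^2+\beta_l^2}\right]^2 e^{-3\pi s/4\beta_l}
\tag{1}
\]

\[
F_3(s) = c_3\,\beta_l^{4+r}\,
\frac{(s+\alpha_l)^2-\beta_l^2/3}{\left[(s+\alpha_l)^2+\beta_l^2\right]^3}\,
e^{-3\pi s/4\beta_l}
\tag{2}
\]

where 

$s = (\sigma + j\omega)$ is the complex frequency, 

$\beta_l$ is the radian frequency to which the point $l$ distance 
from the stapes responds maximally, 

$\beta_l^{4+r}$ is a factor which matches the physiologically measured 
variations in peak amplitude of displacement 
with resonant frequency $\beta_l$,† 

$e^{-3\pi s/4\beta_l}$ is a delay factor of $3\pi/4\beta_l$ seconds which brings the 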

* In man, the stirrup does not move longitudinally as a planar piston but usu- 
ally exhibits some rotational motion. x(t) is taken here as the volume displace- 
ment of the footplate divided by its area. 

† The present form of this factor is applicable only to the frequency range 
below 1000 cps. Here, as previously discussed, 1 the value of r = 0.8. A minor 
modification will be made in this factor presently to make it appropriate for higher 
frequencies. 






[Fig. 2 panels: pole-zero plots in the complex-frequency plane for
$F_1(s)$ and $F_3(s)$, with the pole frequencies marked at $\pm\beta_l$
on the $j\omega$ axis.]



Fig. 2 — Pole zero diagrams for two functional approximations of F(s). 

phase delay of the model into line with the phase 
measured on the human ear. This factor is princi- 
pally transit delay from stapes to point I on the 
membrane*. 

The membrane characteristics are therefore approximated in terms of 
the poles and zeros of these two functions. Because the resonant 
properties of the membrane are nearly constant-Q in character, the real and 
imaginary parts of the pole frequencies are related by a constant factor, 
i.e., $\beta_l = k\alpha_l$. For the present models, the best fits to the 
experimental data are obtained for the following choice of parameters: 



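\[
\begin{aligned}
\text{For } F_1(s):\quad & \frac{\epsilon_l}{\beta_l} = 0.1 \text{ to } 0.0, \qquad
\frac{\gamma_l}{\beta_l} = 1.0, \qquad \frac{\beta_l}{\alpha_l} = 2.0 \\
\text{For } F_3(s):\quad & \frac{\beta_l}{\alpha_l} = 1.7
\end{aligned}
\tag{3}
\]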

Therefore, to within a multiplicative constant, the imaginary part of 
the pole frequency $\beta_l$ completely describes the model. The pole-zero 
diagrams for the two models are shown in Fig. 2. 

* At low frequencies the phase of the model departs somewhat from the 
experimental data. See the discussion of this point in Ref. 1 and also in 
J. L. Flanagan and C. M. Bird, Minimum Phase Responses for the Basilar 
Membrane, J. Acoust. Soc. Am. 34, 1962, p. 114. 

† See earlier comments about fitting phase response. 

The real-frequency responses of the models are obtained by letting 
$s = j\omega$. If frequency is normalized in terms of $\zeta = \omega/\beta_l$, 
then the relative phase and amplitude responses of $F_1(j\zeta)$ and 
$F_3(j\zeta)$ are shown in Fig. 3 for the parameters stated in (3). 
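The normalized response of the first model can be sketched numerically. The
block below is an illustration only: it assumes the form of $F_1(s)$ given in
(1) with $\epsilon_l = 0$, $\gamma_l = \beta_l$ and $\beta_l/\alpha_l = 2.0$,
sets $\beta_l = 1$ so that the frequency variable is $\zeta$, and drops the
constant $c_1\beta_l^{4+r}$. The function name and the frequency grid are
hypothetical choices, not part of the original work.

```python
import numpy as np

def f1_normalized(zeta, eps_ratio=0.0, gamma_ratio=1.0, beta_over_alpha=2.0):
    """Evaluate F_1 at s = j*zeta with beta_l = 1 (relative scale only).

    The constant c_1 * beta_l**(4 + r) is omitted, so only relative
    amplitude and phase are meaningful.
    """
    s = 1j * np.asarray(zeta, dtype=float)
    alpha = 1.0 / beta_over_alpha
    rational = (s + eps_ratio) / (s + gamma_ratio) / ((s + alpha) ** 2 + 1.0) ** 2
    delay = np.exp(-0.75 * np.pi * s)  # e^{-3*pi*s/(4*beta_l)} with beta_l = 1
    return rational * delay

# Assumed grid: zeta from 0.01 to about 3.2 on a logarithmic scale.
zeta = np.logspace(-2.0, 0.5, 2001)
response = f1_normalized(zeta)
amplitude_db = 20.0 * np.log10(np.abs(response))
phase_rad = np.unwrap(np.angle(response))
zeta_peak = zeta[np.argmax(amplitude_db)]
```

Under these assumptions `zeta_peak` gives the normalized frequency of maximal
response, and `amplitude_db` and `phase_rad` can be plotted against `zeta` for
comparison with Fig. 3.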

The inverse Laplace transforms of (1) and (2) are the displacement 
responses of the membrane to an impulse of displacement by the stapes. 
These representations will also be useful in the present discussion. If 
the mathematics is carried out the inverse transforms are found to be: 

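\[
\begin{aligned}
f_1(t) = c_1\,\beta_l^{1+r}\Big\{ & \big[0.033 + 0.360\,\beta_l(t-T)\big]\,
e^{-\beta_l(t-T)/2}\sin\beta_l(t-T) \\
& + \big[0.575 - 0.320\,\beta_l(t-T)\big]\,
e^{-\beta_l(t-T)/2}\cos\beta_l(t-T) - 0.575\,e^{-\beta_l(t-T)}\Big\}
\end{aligned}
\tag{4}
\]

for $t \ge T$ and $\epsilon_l/\beta_l = 0.1$; $f_1(t) = 0$ for $t < T$; 

and 

\[
f_3(t) = \frac{c_3\,\beta_l^{1+r}}{6}\,[\beta_l(t-T)]^2\,
e^{-\beta_l(t-T)/1.7}\sin\beta_l(t-T) \quad \text{for } t \ge T, \qquad
f_3(t) = 0 \quad \text{for } t < T,
\tag{5}
\]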



where the delay $T = 3\pi/4\beta_l$, as previously stated. In the earlier paper, 
the simplicity of $f_3(t)$ was the main reason that $F_3(s)$ was considered 
as an approximation to the experimental frequency-domain data. A plot 




[Fig. 3 axis: normalized frequency, $\zeta = \omega/\beta_l$.]

Fig. 3 — Amplitude and phase responses for two $F(s)$ models. 

Fig. 4 — Impulse responses for the membrane models. These responses are the 
inverse transforms of the frequency data in Fig. 3 and represent membrane 
displacement caused by an impulse of stapes displacement. 

of the responses (4) and (5), on a relative amplitude scale and with 
delay equalized, is shown in Fig. 4. The absolute time origins for the two 
traces are to the left of the relative origin by 1.9 and 1.5 radians, 
respectively. 

As indicated above, the factor $\beta_l^{4+r}$ in (1) and (2) has a form 
appropriate to the frequency range below 1000 cps. If the membrane models 
are to be used at higher frequencies, this factor should be modified ac- 
cording to data given by Bekesy 2 on the peak membrane displacement 
as a function of frequency (see Fig. 4 in Ref. 1). For constant sinusoidal 
displacement of the stapes, Bekesy's data indicate that the peak mem- 
brane displacement increases at about 5 db/octave up to around 1000 
cps, and then tends to flatten off and become roughly constant (at least 
up to about 2000 cps). 

This amplitude variation can be accounted for by altering the 
multiplicative amplitude factor to 
$\beta_l^{4+r}\,[2\pi\cdot 1000/(\beta_l + 2\pi\cdot 1000)]^r$. The 
modification causes the peak response (of the curve shown in Fig. 3) to rise 
at about 5 db/octave below 1000 cps, and to flatten off above this 
frequency. At low $\beta_l$ frequencies the amplitude factor is the same as 
before if the constant $c_1$ is readjusted by multiplying it by $2^r$. With 
this minor modification, then, the functional approximations to $F_l(s)$ are 
appropriate for use at frequencies higher than 1000 cps. 
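The effect of the modified factor on the peak response can be checked with a
short calculation. The sketch below assumes that, at its peak, the rational
part of the model scales as $\beta_l^{-4}$, so that the peak displacement
varies as $\beta_l^{r}\,[2\pi\cdot 1000/(\beta_l + 2\pi\cdot 1000)]^r$ with
$r = 0.8$. The function names are hypothetical.

```python
import numpy as np

R_EXP = 0.8                   # exponent r quoted in the text
KNEE = 2.0 * np.pi * 1000.0   # 2*pi*1000 rad/sec

def peak_factor(beta):
    """Relative peak displacement, assuming the rational part scales as beta**-4."""
    beta = np.asarray(beta, dtype=float)
    return beta ** R_EXP * (KNEE / (beta + KNEE)) ** R_EXP

def slope_db_per_octave(freq_hz):
    """Change in peak level (db) between freq_hz and twice freq_hz."""
    low = peak_factor(2.0 * np.pi * freq_hz)
    high = peak_factor(2.0 * np.pi * 2.0 * freq_hz)
    return 20.0 * np.log10(high / low)
```

Evaluating `slope_db_per_octave` well below and well above 1000 cps shows the
rise of roughly 5 db/octave at low frequencies and the flattening at high
frequencies described above.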
1.1 A Model for Middle-Ear Transmission 

To account for middle-ear transmission one would like an analytical 
specification of the stapes displacement produced by a given sound pres- 
sure at the eardrum for all frequencies of interest. Quantitative physio- 
acoustical data on the operation of the middle ear are very sparse. The 
data which are available are due largely to Bekesy 3 and to Zwislocki. 4 
By considering the topology of the mechanical circuit and the values of 
elastic, mass, and viscous constants measured in physiological prepara- 
tions, it is possible to deduce information about the middle-ear trans- 
mission. Zwislocki used this approach to develop an analog electrical 
circuit for the middle ear in which voltage is analogous to pressure, and 
current to volume velocity. The circuit includes ten components rep- 
resenting the acousto-mechanical elements of the middle ear. Seven of 
the elements are energy storage elements. 

Using the constants suggested by Zwislocki, we measured the transfer 
characteristics of the middle-ear circuit when terminated in an imped- 
ance analogous to the input mechanical impedance of the cochlea. For 
a constant pressure at the eardrum, the amplitude and phase responses 
of the stapes displacement are shown by the curves in Fig. 5.* 

Although the characteristic equation corresponding to Zwislocki's 
analog circuit is of seventh degree, the stapes displacement can be 
analytically approximated reasonably well by a function of third degree. 
(As discussed in the earlier paper, the criterion of fit is again taken as an 
intuitive one.) Such an approximating function is of the form: 

\[
G(s) = \frac{c_0}{(s+a)\,[(s+a)^2+b^2]}, \tag{6}
\]

where $c_0$ is a positive real constant. [When combined with $F_l(s)$, the 
multiplying constants are chosen to yield proper absolute membrane 
displacement. For convenience, therefore, consider $c_0 = a(a^2 + b^2)$ so 

* After the present work was carried out, an excellent paper by A. Møller 
(Network Model of the Middle Ear, J. Acoust. Soc. Am. 33, 1961, p. 168) 
appeared in which analogous electrical circuits for the middle ear are 
deduced on the basis of input impedance measurements at the drum and the 
middle-ear topology. For a comparison with Zwislocki's data (which we had 
already used), we constructed several of Møller's circuits and measured their 
transfer characteristics. Although their frequency responses differ in fine 
detail, the results of Zwislocki and Møller agree in the gross aspects of the 
transmission characteristics. As do Bekesy's earlier results, both sets of the 
latter data suggest some uncertainty and variability in the middle-ear 
transmission, particularly in regard to the frequency at which the 
transmission begins to diminish appreciably. Apparently the function differs 
among individuals. One of the main objectives of the present paper, however, 
is to demonstrate a computational technique which has been found useful in 
explaining certain auditory phenomena. Whenever physiological data are 
improved and extended, the results can easily be incorporated into the 
analytical technique presented here. 

Fig. 5 — Functional approximation of middle-ear transmission. The solid 
curves are from Zwislocki (Ref. 4) and the plotted points are amplitude and 
phase values of $G(s)$. 



that the low-frequency transmission of G(s) is unity.] When the pole 
frequencies of G(s) are related according to 



$b = 2a = 2\pi(1500)$ rad/sec, 



(7) 



the fit to Zwislocki's data is shown by the plotted points in Fig. 5. 
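The third-degree approximation is easy to evaluate numerically. The sketch
below assumes the form $G(s) = c_0/\{(s+a)[(s+a)^2+b^2]\}$ with
$b = 2a = 2\pi(1500)$ rad/sec and $c_0 = a(a^2+b^2)$, so that the
low-frequency transmission is unity. The function name is a hypothetical
choice.

```python
import numpy as np

B_ME = 2.0 * np.pi * 1500.0            # b, rad/sec
A_ME = B_ME / 2.0                      # a = b/2
C0 = A_ME * (A_ME ** 2 + B_ME ** 2)    # gives unity transmission at zero frequency

def middle_ear_response(freq_hz):
    """Complex stapes displacement per unit eardrum pressure at real frequencies."""
    s = 2j * np.pi * np.asarray(freq_hz, dtype=float)
    return C0 / ((s + A_ME) * ((s + A_ME) ** 2 + B_ME ** 2))

freqs = np.logspace(2.0, 4.3, 500)     # assumed grid, 100 cps to about 20 kc
gain_db = 20.0 * np.log10(np.abs(middle_ear_response(freqs)))
phase_rad = np.unwrap(np.angle(middle_ear_response(freqs)))
```

With three poles and no finite zeros, the amplitude in this sketch falls
toward 18 db/octave well above 1500 cps.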

The inverse transform of (6) is the displacement response of the stapes 
to an impulse of pressure at the eardrum. It is obtained easily and will 
be useful in the subsequent discussion. Let 



\[
G(s) = G_1(s)\,G_2(s),
\]

where 

\[
G_1(s) = \frac{c_0}{s+a}, \qquad G_2(s) = \frac{1}{(s+a)^2+b^2}.
\]

The inverses of the parts are: 

\[
g_1(t) = c_0\,e^{-at}, \qquad g_2(t) = \frac{e^{-at}}{b}\sin bt. \tag{8}
\]

The inverse of $G(s)$ is then 

\[
g(t) = \int_0^t g_1(\tau)\,g_2(t-\tau)\,d\tau, \tag{9}
\]

Fig. 6 — Displacement response of the stapes and its time derivative to an 
impulse of pressure at the eardrum. 



or 

\[
g(t) = \frac{c_0\,e^{-at}}{b}\cdot\frac{(1-\cos bt)}{b}
     = \frac{c_0\,e^{-bt/2}}{b^2}\,(1-\cos bt). \tag{10}
\]

Also for use at a future point in the discussion we note that the time 
derivative of the stapes displacement is: 

\[
\dot g(t) = \frac{c_0\,e^{-bt/2}}{2b}\,(2\sin bt + \cos bt - 1). \tag{11}
\]

Plots of $g(t)$ and $\dot g(t)$ are shown in Fig. 6. 
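The two closed forms can be cross-checked numerically. The sketch below
assumes $b = 2a = 2\pi(1500)$ rad/sec and $c_0 = a(a^2+b^2)$, writes the
displacement as $c_0 e^{-bt/2}(1-\cos bt)/b^2$ and its derivative as
$c_0 e^{-bt/2}(2\sin bt + \cos bt - 1)/2b$, and compares the latter with a
finite-difference derivative of the former. Function names are hypothetical.

```python
import numpy as np

B_ME = 2.0 * np.pi * 1500.0
A_ME = B_ME / 2.0
C0 = A_ME * (A_ME ** 2 + B_ME ** 2)

def stapes_displacement(t):
    """Stapes displacement for a unit pressure impulse, t >= 0 (seconds)."""
    t = np.asarray(t, dtype=float)
    return C0 * np.exp(-A_ME * t) * (1.0 - np.cos(B_ME * t)) / B_ME ** 2

def stapes_velocity(t):
    """Time derivative of stapes_displacement, t >= 0."""
    t = np.asarray(t, dtype=float)
    return C0 * np.exp(-A_ME * t) * (2.0 * np.sin(B_ME * t) + np.cos(B_ME * t) - 1.0) / (2.0 * B_ME)

t = np.linspace(0.0, 4e-3, 40001)      # 0 to 4 msec in 0.1 microsecond steps
dt = t[1] - t[0]
numeric_velocity = np.gradient(stapes_displacement(t), dt)
max_error = np.max(np.abs(numeric_velocity[1:-1] - stapes_velocity(t)[1:-1]))
area = np.sum(stapes_displacement(t)) * dt   # approximates G(0)
```

With the unity low-frequency normalization, `area` should come out close to
one, since the integral of the impulse response equals the transmission at
zero frequency.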

1.2 Combined Response of Middle Ear and Basilar Membrane 

The combined response of the models for the middle ear and basilar 
membrane is simply: 

\[
H_l(s) = G(s)\,F_l(s), \qquad h_l(t) = g(t) * f_l(t). \tag{12}
\]

To simplify the computations and to illustrate both results, the response 
for model $F_1(s)$ will be computed in the frequency domain and that for 
$F_3(s)$ will be computed in the time domain. 

1.2.1 Inverse Transform of H_1(s) 

Disregarding for the moment the constant delay and amplitude terms, 
which can be resupplied at the end if needed, the problem of transforming 
$H_1(s) = G(s)F_1(s)$ amounts to computing the inverse of: 

\[
H_1'(s) = \frac{1}{s+a}\cdot\frac{1}{(s+a)^2+b^2}\cdot
\frac{s+\epsilon}{s+\gamma}\cdot\frac{1}{[(s+\alpha)^2+\beta^2]^2}. \tag{13}
\]

Expand $H_1'(s)$ as partial fractions: 

\[
H_1'(s) = \frac{A}{s+a} + \frac{Bs+C}{(s+a)^2+b^2} + \frac{D}{s+\gamma}
+ \frac{E(s)}{[(s+\alpha)^2+\beta^2]^2}, \tag{14}
\]



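where 

\[
\begin{aligned}
A &= \frac{1}{b^2}\cdot\frac{\epsilon-a}{\gamma-a}\cdot
     \frac{1}{[(\alpha-a)^2+\beta^2]^2},\\
B &= 2\,\mathrm{Re}\,B',\\
B' &= -\frac{1}{2b^2}\cdot\frac{\epsilon-a+jb}{\gamma-a+jb}\cdot
     \frac{1}{[(\alpha-a)^2+\beta^2-b^2+j2b(\alpha-a)]^2},\\
C &= [a\,(2\,\mathrm{Re}\,B') - b\,(2\,\mathrm{Im}\,B')],\\
D &= \frac{1}{a-\gamma}\cdot\frac{1}{(a-\gamma)^2+b^2}\cdot
     \frac{\epsilon-\gamma}{[(\alpha-\gamma)^2+\beta^2]^2},\\
E(s) &= (a_0 + a_1 s + a_2 s^2 + a_3 s^3).
\end{aligned}
\]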

On the basis of the previous findings the problem is particularized to the 
conditions: 

\[
\beta = 2\alpha, \qquad b = 2a, \qquad \gamma \ne a, \qquad
\gamma = \beta \ne b, \qquad \epsilon = 0. \tag{15}
\]

Also let $\eta = \beta/b$. 
If the arithmetic is followed through, the constants are found to be: 

\[
\begin{aligned}
A &= -\frac{1}{b^6\,(2\eta-1)\,(1.25\eta^2-0.50\eta+0.25)^2},\\
B' &= \frac{(0.50-j1.00)}{2b^6\,(\eta-0.50+j1.00)\,
      [(1.25\eta^2-0.50\eta-0.75)+j(\eta-1.00)]^2},\\
B &= 2\,\mathrm{Re}\,B',\\
C &= b\,(\mathrm{Re}\,B' - 2\,\mathrm{Im}\,B'),\\
D &= \frac{1}{b^6\,\eta^3\,(\eta-0.50)\,(\eta^2-\eta+1.25)\,(1.25)^2},
\end{aligned}
\tag{16}
\]

and the coefficients of $E(s)$ are found to be: 

\[
\begin{aligned}
a_0 &= -\eta^3b^2\,(3.12\,\eta bA + 1.25\,\eta C + 1.56\,bD),\\
a_1 &= -b^2\big[A(3.50\eta^2-\eta+0.25) + B(3.50\eta^2-2.00\eta-0.25)
       + 2.50\,\eta^2D\big] - bC(2\eta-1),\\
a_2 &= -[Ab(2\eta-0.50) + Bb(2\eta-1) + Db\eta + C],\\
a_3 &= -(A + B + D).
\end{aligned}
\]

Although somewhat involved numerically, the inverse transform of 
$H_1'(s)$ can now proceed termwise as indicated in (14). The basic 
procedure from this point has already been indicated in the appendix of 
the earlier paper. When the details of this instance are carried through, 
the result is: 

\[
\begin{aligned}
h_1'(t) ={}& A\,e^{-bt/2} + B\,e^{-bt/2}(\cos bt - 0.50\sin bt)
 + C\,\frac{1}{b}\,(e^{-bt/2}\sin bt) + D\,e^{-\eta bt}\\
&+ (e^{-\eta bt/2}\sin\eta bt)\,\frac{1}{2\eta^3b^3}
 \Big[a_0 - a_1\frac{\eta b}{2} + a_2(1.25\,\eta^2b^2) - a_3(1.63\,\eta^3b^3)\Big]\\
&+ (\eta bt\,e^{-\eta bt/2}\sin\eta bt)\,\frac{1}{2\eta^2b^2}
 \Big[a_1 - a_2\,\eta b - a_3(0.25\,\eta^2b^2)\Big]\\
&+ a_3\,e^{-\eta bt/2}\cos\eta bt\\
&+ (\eta bt\,e^{-\eta bt/2}\cos\eta bt)\,\frac{1}{2\eta^3b^3}
 \Big[-a_0 + a_1\frac{\eta b}{2} + a_2(0.75\,\eta^2b^2) - a_3(1.38\,\eta^3b^3)\Big];
 \quad \text{for } t \ge 0.
\end{aligned}
\tag{17}
\]
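The partial-fraction expansion and its inverse transform can be checked
numerically. The sketch below works in units where $b = 1$ (so $a = 0.5$),
with $\alpha = \eta/2$, $\beta = \gamma = \eta$ and $\epsilon = 0$. The
constants follow from the residues of $H_1'(s)$ and from matching polynomial
coefficients; the three-decimal coefficients are replaced by their unrounded
values (3.125, 1.5625, 1.625, 1.375). All function names are hypothetical.

```python
import numpy as np

def constants(eta):
    """Partial-fraction constants of H_1'(s) for b = 1.

    Singular at eta = 0.5 and eta = 1, where poles coincide.
    """
    A = -1.0 / ((2 * eta - 1) * (1.25 * eta**2 - 0.5 * eta + 0.25) ** 2)
    Bp = (0.5 - 1j) / (
        2 * (eta - 0.5 + 1j)
        * ((1.25 * eta**2 - 0.5 * eta - 0.75) + 1j * (eta - 1)) ** 2
    )
    B = 2 * Bp.real
    C = Bp.real - 2 * Bp.imag
    D = 1.0 / (eta**3 * (eta - 0.5) * (eta**2 - eta + 1.25) * 1.25**2)
    a3 = -(A + B + D)
    a2 = -(A * (2 * eta - 0.5) + B * (2 * eta - 1) + D * eta + C)
    a1 = -(A * (3.5 * eta**2 - eta + 0.25) + B * (3.5 * eta**2 - 2 * eta - 0.25)
           + C * (2 * eta - 1) + 2.5 * eta**2 * D)
    a0 = -(eta**3) * (3.125 * eta * A + 1.25 * eta * C + 1.5625 * D)
    return A, B, C, D, (a0, a1, a2, a3)

def h1_direct(s, eta):
    """H_1'(s) evaluated from its factored form (b = 1, epsilon = 0)."""
    return s / ((s + 0.5) * ((s + 0.5) ** 2 + 1.0) * (s + eta)
                * ((s + eta / 2) ** 2 + eta**2) ** 2)

def h1_partial_fractions(s, eta):
    """H_1'(s) evaluated from the partial-fraction expansion."""
    A, B, C, D, (a0, a1, a2, a3) = constants(eta)
    E = a0 + a1 * s + a2 * s**2 + a3 * s**3
    return (A / (s + 0.5) + (B * s + C) / ((s + 0.5) ** 2 + 1.0)
            + D / (s + eta) + E / ((s + eta / 2) ** 2 + eta**2) ** 2)

def h1_prime(t, eta):
    """Impulse response h_1'(t) for t >= 0, with time in units of 1/b."""
    A, B, C, D, (a0, a1, a2, a3) = constants(eta)
    t = np.asarray(t, dtype=float)
    env_me = np.exp(-0.5 * t)          # middle-ear envelope
    env_bm = np.exp(-0.5 * eta * t)    # membrane envelope
    x = eta * t
    k_sin = (a0 - 0.5 * a1 * eta + 1.25 * a2 * eta**2 - 1.625 * a3 * eta**3) / (2 * eta**3)
    k_tsin = (a1 - a2 * eta - 0.25 * a3 * eta**2) / (2 * eta**2)
    k_tcos = (-a0 + 0.5 * a1 * eta + 0.75 * a2 * eta**2 - 1.375 * a3 * eta**3) / (2 * eta**3)
    return (A * env_me + B * env_me * (np.cos(t) - 0.5 * np.sin(t))
            + C * env_me * np.sin(t) + D * np.exp(-eta * t)
            + env_bm * (k_sin * np.sin(x) + k_tsin * x * np.sin(x)
                        + a3 * np.cos(x) + k_tcos * x * np.cos(x)))
```

Calling `h1_prime(t, eta)` for `eta` equal to 0.1, 0.8 and 3.0 gives waveforms
of the kind discussed next, apart from the delay and amplitude constants that
this sketch leaves out.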



$h_1(t)$ is obtained from $h_1'(t)$ by resupplying the constant delay 
$T = 3\pi/4\beta$ and the multiplicative amplitude constants; that is, by 
replacing $t$ with $(t - T)$ and by multiplying $h_1'(t - T)$ by 

$[c_0 c_1 \beta_l^{4+r}\,(2\pi\cdot 1000/(\beta_l + 2\pi\cdot 1000))^r]$, where $r = 0.8$. 

Fig. 7 — Displacement responses of apical, middle and basal points on the 
membrane to an impulse of pressure at the eardrum. These are computed from the 
inverse transform of $[G(s)F_1(s)]$. 

The form of the impulse response is thus seen to depend upon the 
parameter $\eta = \beta/b$. Values of $\eta < 1.0$ refer to (apical) membrane 
points whose frequency of maximal response is less than the critical frequency 
of the middle ear. For these points the middle-ear transmission is 
essentially constant with frequency, and the membrane displacement is 
very nearly indicated by $f_1(t)$ in (4). On the other hand, values of 
$\eta > 1.0$ refer to (basal) points which respond maximally at frequencies 
greater than the critical frequency of the middle ear. For these points 
the middle-ear transmission is highly dependent upon frequency and 
would be expected to influence strongly the membrane displacement. To 
illustrate this point, (17) has been evaluated for $\eta = 0.1$, 0.8, and 3.0. 
The result, with the delay resupplied, is shown in Fig. 7. 

For an impulse of pressure delivered to the eardrum, the three solid 
curves represent the membrane displacements at points which respond 
maximally to frequencies of 150, 1200, and 4500 cps. Each of the plots 
also includes a dashed curve. In Figs. 7(a) and 7(b), the dashed curve 
is the membrane displacement computed by assuming the middle-ear 
transmission to be flat with zero phase. [This is simply the response 
$\mathcal{L}^{-1}F_1(s)$.] In Fig. 7(c) the dashed curve is the time 
derivative of the stapes displacement, $\dot g(t)$, taken from Fig. 6. The 
suggestion is that in the basal region, the form of the membrane 
displacement is very similar to the derivative of the stapes displacement. 
This apparently is the case, and this point will be considered again presently. 

1.2.2 Inverse Transform for H_3(s) 

If, as in the previous section, delay and scale constants are temporarily 
disregarded, the inverse transform for $[G(s)F_3(s)]$ is given by the 
time-domain convolution: 

\[
h_3'(t) = (\beta t)^2 e^{-\beta t/1.7}\sin\beta t \;*\;
\frac{c_0\,e^{-bt/2}}{b^2}\,(1-\cos bt);
\]

or 

\[
h_3'(t) = \int_0^t \big[(\beta\tau)^2 e^{-\beta\tau/1.7}\sin\beta\tau\big]\,
\frac{c_0\,e^{-b(t-\tau)/2}}{b^2}\,[1-\cos b(t-\tau)]\,d\tau, \tag{18}
\]

for $t \ge 0$. 
When this integration is carried through, the result is: 

$h_3'(t) = (\ldots)\,[\mathrm{Im}(P) - \tfrac{1}{2}\mathrm{Im}(Q) - \tfrac{1}{2}\mathrm{Im}(R)]$; $t \ge 0$ (19) 

As before, $h_3(t)$ is obtained by resupplying the amplitude factors and 
the delay $T$. By way of examining the form of $h_3(t)$, (19) has been 
evaluated for $\eta = 0.1$ and 3.0. The resulting $h_3(t)$ is plotted in 
Fig. 8. Comparison with the previous response for $h_1(t)$ shows the results 
to be similar. 

1.2.3 Combined Frequency-Domain Responses 

The individual frequency-domain responses for $F_l(s)$ and $G(s)$ have 
been shown in Figs. 3 and 5, respectively. The combined response in the 
frequency domain is simply the sum of individual amplitude (in db) 
and phase (in radians) responses. The combined amplitude and phase 
responses for the model $G(s)F_1(s)$ are shown in Figs. 9 and 10, 
respectively. 

Fig. 8 — Impulse responses for apical and basal points computed from 
$[g(t) * f_3(t)]$. 

As already indicated by the impulse responses, one sees that the 
response of apical (low-frequency) points on the membrane is given 
essentially by $F(s)$, while for basal (high-frequency) points the response 
is considerably influenced by the middle-ear transmission $G(s)$. In 
particular, notice two things about the frequency response of the 
membrane model [i.e., $F(\omega)$]. One, the low-frequency skirt of the 
amplitude curve rises at about 6 db/octave. And two, the phase of the 
membrane model [i.e., $F(\omega)$] approaches $+\pi/2$ radians at 
frequencies below the peak amplitude response. In other words, at 
frequencies appreciably less than its peak response frequency, the membrane 
function $F(\omega)$ behaves approximately as a differentiator. 

Fig. 10 — Phase responses for the combined models $[G(\omega)F_1(\omega)]$. 

Because the middle-ear transmission begins to diminish in amplitude 
at frequencies above about 1500 cps, the membrane displacement in the 
basal region is roughly the time derivative of the stapes displacement. The 
waveform of the impulse response along the basal part of the membrane 
is therefore approximately constant in shape. Along the apical part, 
however, the impulse response oscillates more slowly as the apex is 
approached. This has already been illustrated in Fig. 7. [If the apical 
response is considered on a time scale normalized in terms of $(\beta_l t)$, 
then the displacement waveform is constant in shape.] This relation can be, 
and has been, supported by psychoacoustic measurements. These results will 
be discussed in the second part of the paper. 

Notice one other thing from Fig. 9. Because the amplitude response of 
the middle ear declines appreciably at high frequencies, the amplitude 
response of a basal point is highly asymmetrical (for example, the 
combined response for $\eta = 3.0$). The result is that a given basal point, 
while responding with greater amplitude than any other membrane point at 
its characteristic frequency (i.e., at $\beta_l$), responds with greatest 
amplitude (but not greater than some other point) at some lower frequency. 
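The downward shift of the frequency of greatest response at a basal point can
be illustrated numerically. The sketch below assumes the third-degree
middle-ear function with $b = 2a = 2\pi(1500)$ rad/sec and unity
low-frequency transmission, and the first membrane model with
$\epsilon = 0$, $\gamma = \beta$ and $\alpha = \beta/2$. Constant amplitude
factors and the delay are omitted because they do not move the amplitude
peak. Function names and the frequency grid are hypothetical.

```python
import numpy as np

B_ME = 2.0 * np.pi * 1500.0
A_ME = B_ME / 2.0
C0 = A_ME * (A_ME ** 2 + B_ME ** 2)

def middle_ear(s):
    """Third-degree middle-ear approximation with unity low-frequency gain."""
    return C0 / ((s + A_ME) * ((s + A_ME) ** 2 + B_ME ** 2))

def membrane_shape(s, beta):
    """Rational part of the first membrane model, scaled to be dimensionless."""
    return beta ** 4 * s / ((s + beta) * ((s + beta / 2.0) ** 2 + beta ** 2) ** 2)

def peak_frequency_hz(beta, with_middle_ear):
    """Frequency (cps) of greatest amplitude on a 0.5-cps grid from 50 to 10000."""
    freqs = np.linspace(50.0, 10000.0, 19901)
    s = 2j * np.pi * freqs
    magnitude = np.abs(membrane_shape(s, beta))
    if with_middle_ear:
        magnitude = magnitude * np.abs(middle_ear(s))
    return freqs[np.argmax(magnitude)]

beta_basal = 2.0 * np.pi * 4500.0      # eta = 3.0
peak_alone = peak_frequency_hz(beta_basal, with_middle_ear=False)
peak_combined = peak_frequency_hz(beta_basal, with_middle_ear=True)
```

In this sketch `peak_combined` falls well below `peak_alone`, which is the
asymmetry described above for a point with $\eta = 3.0$.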

1.3 Some Temporal and Spatial Relations For Membrane Displacement 

Certain results from physiological research 5 suggest that shear stresses 
along the basilar membrane may be as important in the mechanical-to- 
neural transduction as absolute displacements of the membrane. Ac- 
cordingly, the spatial derivative of the displacement may be the mechani- 
cal factor of consequence. The computational tractability of the model 
permits a straightforward consideration of some temporal and spatial 
relations for the displacement.* 

As a beginning, because they are easiest to talk about, consider only 
apical membrane points where the middle-ear transmission is essentially 
constant. In this case the displacement is nearly $f(t)$ [see (4) and (5)], 
and is only a function of $t$ and the point parameter $\beta$. The variable 
$\beta$ is a function of the distance along the membrane and can be so 
specified. (This functional relationship will be developed presently.) The 
impulse response is essentially a function of the product $\beta(t - T)$ and 
has a multiplicative factor involving $\beta^{1+r}$ (i.e., 
$\beta^{1+r} g[\beta(t - T)]$). This fact points up a simple aspect of the 
dispersive nature of the basilar membrane. 

If a disturbance is propagating in a nondispersive medium, the wave 
moves with a velocity which is the same for all frequency components, 
and the waveform is maintained undistorted. Let the wave for a 
one-dimensional situation be $p(t,x) = p(ct - x)$, where $c$ is the velocity. 
Then, 

\[
\frac{\partial p}{\partial t} = c\,\frac{\partial p}{\partial(ct-x)}, \qquad
\frac{\partial p}{\partial x} = -\frac{\partial p}{\partial(ct-x)}, \tag{20}
\]

* See the further discussion of spatial derivatives (displacement gradients) in 
Section II. 



and the time and space derivatives have the same waveform. The 
corresponding relations for the displacement responses of the membrane, 
however, must differ somewhat in time waveform. The model $f_3(t)$, 
Eq. (5), because of its simplicity, is particularly useful for illustrating 
this. 

Again neglecting the amplitude constants which do not involve $\beta$ or $t$, 
and which can be resupplied in the result, $f_3(t)$ reduces to: 

\[
f_3'(t) = \beta^{1+r}\,g[\beta(t-T)]; \qquad t \ge T, \tag{21}
\]

where 

\[
g[\beta(t-T)] = [\beta(t-T)]^2\,e^{-\beta(t-T)/1.7}\sin\beta(t-T),
\]

and 

\[
T = 3\pi/4\beta.
\]

Then, 

\[
\frac{\partial f_3'}{\partial\beta} = (1+r)\,\beta^{r}\,g
 + \beta^{1+r}\,\frac{\partial g}{\partial\beta}. \tag{22}
\]

Fig. 11 — Location of peak displacement of basilar membrane as a function of 
frequency (after Bekesy, Ref. 3). 

When the differentiation is carried out the result is: 

\[
\frac{\partial f_3'}{\partial\beta} = \beta^{r}\,[\beta(t-T)]^2\,
e^{-\beta(t-T)/1.7}\Big\{\beta t\cos\beta(t-T)
 + \Big[1 + r + \frac{2t}{t-T} - \frac{\beta t}{1.7}\Big]\sin\beta(t-T)\Big\};
\qquad t \ge T. \tag{23}
\]
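The derivative with respect to $\beta$ can be checked against a finite
difference. The sketch below assumes the simplified response
$\beta^{1+r}\theta^2 e^{-\theta/1.7}\sin\theta$ with
$\theta = \beta t - 3\pi/4$ and $r = 0.8$, and writes the analytic derivative
with the term $2t/(t-T)$ multiplied through by $\theta^2$ so that no division
is needed at $t = T$. Function names are hypothetical.

```python
import numpy as np

R_EXP = 0.8  # exponent r quoted in the text

def f3_shape(t, beta, r=R_EXP):
    """Simplified third-model impulse response; zero before the delay T."""
    theta = beta * np.asarray(t, dtype=float) - 0.75 * np.pi
    value = beta ** (1 + r) * theta ** 2 * np.exp(-theta / 1.7) * np.sin(theta)
    return np.where(theta >= 0.0, value, 0.0)

def df3_dbeta(t, beta, r=R_EXP):
    """Analytic derivative of f3_shape with respect to beta, for t >= T."""
    t = np.asarray(t, dtype=float)
    theta = beta * t - 0.75 * np.pi
    bt = beta * t
    inner = (theta ** 2 * (bt * np.cos(theta) + (1 + r - bt / 1.7) * np.sin(theta))
             + 2.0 * bt * theta * np.sin(theta))
    value = beta ** r * np.exp(-theta / 1.7) * inner
    return np.where(theta >= 0.0, value, 0.0)
```

A central difference in `beta` at fixed `t`, taken a little after the delay
so that the perturbed response does not cross `t = T`, should agree closely
with `df3_dbeta`.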

As indicated earlier, the functional relation between the frequency $\beta$ 
and the distance along the membrane is needed to put (23) into the form 
of a space derivative. Bekesy gives data on the place of maximal 
displacement along the membrane as a function of frequency. These data 
are replotted in Fig. 11. For purposes of the present discussion, the data 
for frequencies less than about 1000 cps are of main interest. If the 
basilar membrane is assumed to be 35 mm long, and if distance is now 
reckoned from the apical end, the low-frequency data are reasonably well 
approximated by: 

\[
x = 7.5\log_{10}\frac{\beta}{40\pi}, \tag{24}
\]

where $x$ is the distance from the apex in mm. This line is drawn in Fig. 11. 
It is now easy to compute 

\[
\frac{d\beta}{dx} = \frac{\ln 10}{7.5}\,\beta = 0.31\,\beta. \tag{25}
\]
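The place-frequency relation and its derivative can be evaluated directly.
The sketch below assumes the low-frequency fit
$x = 7.5\log_{10}(\beta/40\pi)$, with $x$ in mm from the apex and
$\beta = 2\pi f$, so that $\beta/40\pi = f/20$. Function names are
hypothetical.

```python
import math

def distance_from_apex_mm(freq_hz):
    """Place of maximal response (mm from the apex) for the low-frequency fit."""
    return 7.5 * math.log10(freq_hz / 20.0)

def beta_at(x_mm):
    """Radian frequency of maximal response at x_mm from the apex."""
    return 40.0 * math.pi * 10.0 ** (x_mm / 7.5)

# d(beta)/dx divided by beta; a constant for a logarithmic place map.
DBETA_DX_OVER_BETA = math.log(10.0) / 7.5

def dbeta_dx_numeric(x_mm, step_mm=1e-4):
    """Central-difference estimate of d(beta)/dx, in rad/sec per mm."""
    return (beta_at(x_mm + step_mm) - beta_at(x_mm - step_mm)) / (2.0 * step_mm)
```

The constant `DBETA_DX_OVER_BETA` is the factor that converts a derivative
with respect to $\beta$ into a derivative with respect to distance.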






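Applying this result to (23) yields: 

\[
\frac{\partial f_3'}{\partial x} = 0.31\,\beta\,\frac{\partial f_3'}{\partial\beta}
 = 0.31\,\beta^{1+r}\,[\beta(t-T)]^2\,e^{-\beta(t-T)/1.7}
 \Big\{\beta t\cos\beta(t-T)
 + \Big[1 + r + \frac{2t}{t-T} - \frac{\beta t}{1.7}\Big]\sin\beta(t-T)\Big\};
 \qquad t \ge T. \tag{26}
\]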



Except for the constant amplitude factor, this is the spatial derivative 
of the impulse response for apical membrane points. It is plotted in Fig. 
12. One notices its form is not radically different from the displacement. 

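The time derivative follows directly from (21), and is: 

\[
\frac{\partial f_3'}{\partial t} = \beta^{1+r}\,\frac{\partial g}{\partial t}
 = \beta^{2+r}\,\frac{\partial g}{\partial(\beta t)},
\]

or, 

\[
\frac{\partial f_3'}{\partial t} = \beta^{2+r}\,[\beta(t-T)]\,
e^{-\beta(t-T)/1.7}\Big\{\beta(t-T)\cos\beta(t-T)
 + \Big[2 - \frac{\beta(t-T)}{1.7}\Big]\sin\beta(t-T)\Big\};
\qquad t \ge T. \tag{27}
\]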



This function, except for amplitude factor, is the time derivative of the 
apical impulse response. It is plotted in Fig. 13. One notices that for a 
given (apical) point on the membrane, the time derivative of displacement 
is not greatly different in form from the spatial derivative. As mentioned 
earlier, the derivatives would have the same form if all frequency 
components propagated at the same velocity. 

Fig. 12 — First spatial derivative of membrane displacement. 

Fig. 13 — First time derivative of membrane displacement. 

It also is of interest to consider the frequency-domain correlate of the 
spatial derivative. In this case it is equally easy to begin in a general 
way and not initially restrict the discussion to the apical region. For the 
model $H_1(s)$ the impulse response can be written in terms of its Laplace 
transform: 

\[
h_1(t) = \frac{1}{2\pi j}\int_{\sigma-j\infty}^{\sigma+j\infty} H_1(s)\,e^{st}\,ds,
\]

where 

\[
H_1(s) = G(s)\,F_1(s), \tag{28}
\]

and where $G(s)$ and $F_1(s)$, the latter a function of the point parameter 
$\beta$, have been specified previously. The spatial derivative, in terms of 
the frequency parameter $\beta$, is therefore: 

\[
\frac{\partial h_1}{\partial\beta} = \frac{1}{2\pi j}
\int_{\sigma-j\infty}^{\sigma+j\infty} G(s)\,
\frac{\partial F_1(s)}{\partial\beta}\,e^{st}\,ds. \tag{29}
\]

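The quantity of interest is $\partial h_1/\partial\beta$. From the previous 
discussion: 

\[
F_1(s) = c_1\,\beta^{4+r}\left(\frac{2000\pi}{\beta+2000\pi}\right)^{r}
\left(\frac{s+\epsilon}{s+\gamma}\right)
\left[\frac{1}{(s+\alpha)^2+\beta^2}\right]^2 e^{-3\pi s/4\beta}. \tag{30}
\]

Taking, as earlier indicated, $\beta = 2\alpha$, $\gamma = \beta$, and 
$\epsilon = 0$, and carrying through the differentiation gives: 

\[
\frac{\partial F_1}{\partial\beta} = F_1(s,\beta)\left[
\frac{4+r}{\beta} - \frac{r}{(\beta+2000\pi)} + \frac{3\pi s}{4\beta^2}
- \frac{2(s+2.5\beta)}{(s+0.5\beta)^2+\beta^2} - \frac{1}{s+\beta}\right]. \tag{31}
\]

If we consider the result for real frequencies (i.e., $s = j\omega$) and 
normalize frequency by letting $\zeta = \omega/\beta$, then (31) becomes: 

\[
\frac{\partial F_1}{\partial\beta} = F_1(\zeta,\beta)\,\frac{1}{\beta}\left[
\frac{4\beta+2000\pi(4+r)}{(\beta+2000\pi)} + j\frac{3\pi\zeta}{4}
- \frac{2(2.5+j\zeta)}{(1.25-\zeta^2+j\zeta)} - \frac{1}{1+j\zeta}\right]. \tag{32}
\]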



This can be put in terms of the spatial derivative (at least for apical 
points) by applying (25). If this is done, then the spatial derivative 
becomes: 

\[
\frac{\partial F_1}{\partial x} = 0.31\,\beta\,\frac{\partial F_1}{\partial\beta}
 = 0.31\,F_1(\zeta,\beta)\,\gamma(\zeta), \tag{33}
\]

where $\gamma(\zeta)$ denotes the bracketed factor of (32). 

Fig. 14 — Frequency domain correlate, $\gamma(\zeta)$, of the first spatial 
derivative. 



For points lying in the apical half of the membrane, therefore, (i.e., 
$\beta < 2000\pi$) the frequency-domain representation of the spatial 
derivative is simply $F_1(\zeta)$ (see Fig. 3) multiplied by the bracketed 
factor $\gamma(\zeta)$ in (33). The phase and amplitude of this factor for 
$\beta \ll 2000\pi$ are plotted in Fig. 14. 

One notices that for $\zeta < 1$ the bracketed term, to a crude 
approximation, is similar to a time differentiation. That is, the amplitude 
variation is +6 db/octave and the phase is $+\pi/2$. This indication is 
consonant with the previous time-domain results shown in Figs. 12 and 13. 
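The bracketed factor can be evaluated directly to see how closely it
resembles a differentiator. The sketch below assumes the bracket of (32),
namely
$[4\beta + 2000\pi(4+r)]/(\beta + 2000\pi) + j3\pi\zeta/4 - 2(2.5 + j\zeta)/(1.25 - \zeta^2 + j\zeta) - 1/(1 + j\zeta)$,
with $r = 0.8$. The function name and the default $\beta = 0$ (standing in
for $\beta \ll 2000\pi$) are hypothetical choices.

```python
import numpy as np

def gamma_factor(zeta, beta=0.0, r=0.8):
    """Bracketed factor relating dF_1/dbeta to F_1, at normalized frequency zeta."""
    knee = 2000.0 * np.pi
    jz = 1j * np.asarray(zeta, dtype=float)
    first = (4.0 * beta + knee * (4.0 + r)) / (beta + knee)
    # 1.25 - zeta**2 + j*zeta is written as 1.25 + (j*zeta)**2 + j*zeta.
    return (first + 0.75 * np.pi * jz
            - 2.0 * (2.5 + jz) / (1.25 + jz ** 2 + jz)
            - 1.0 / (1.0 + jz))

zeta = np.logspace(-1.0, 0.3, 200)           # assumed grid, 0.1 to about 2
amplitude_db = 20.0 * np.log10(np.abs(gamma_factor(zeta)))
phase_rad = np.angle(gamma_factor(zeta))
```

In this sketch the factor tends to a small real value as $\zeta$ goes to zero
and is dominated by its positive imaginary part at moderate $\zeta$, which is
consistent with the crude differentiator-like behavior noted above.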

1.4 An Electrical Circuit for Simulating Basilar Membrane Displacement* 

On the basis of the relations developed in the previous sections [Eqs. 
(6) and (30), for example], it is possible to construct electrical circuits 
whose transmission properties are identical to those of the functions 
$G(s)$ and $F_l(s)$. This is most easily done by representing the critical 
frequencies in terms of simple cascaded resonant circuits. The additional 
phase delay can be supplied by means of an electrical delay line. A 
simulation of $G(s)$ as given in (6) and $F_1(s)$ as given in (30) (for 
$\epsilon = 0$) is shown in Fig. 15. The voltage at an individual output tap 
represents the membrane displacement at a specified distance from the 
stapes. The electrical voltages analogous to the sound pressure at the 
eardrum and to the stapes displacement are also indicated. The buffer 
amplifiers labelled A have fixed gains which take account of the proper 
multiplicative amplitude constants. 

* The material in this section was presented orally before the Sixty-Second 
Meeting of the Acoustical Society of America, Cincinnati, Ohio, November, 1961. 
The abstract appears in J. Acoust. Soc. Am., 33, 1961, p. 1670. 

Fig. 15 — Electrical network representation of the model $[G(s)F_1(s)]$. 

The circuit elements are selected according to the relations stated for 
G(s) and F t (s). [See Eqs. (.3) and (7).] For example, the procedure can 
be as follows. For the middle-ear simulation, choose a convenient R ', 
say 10K. Then, because b = 2a = 2ir-1500, and because a = \/R 'Cq, 

Co' = 0.02 M .f. 
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Select a convenient L , say 2h. Then, because 1.256 = \/L C and 

a = Ro/2Lo , 

Co = 0.005 m/, 

and 

Ru = 19K. 

The components for the basilar membrane networks are chosen in the 
same way. In this case: 



$$\beta_i = \frac{1}{R_i'C_i'}, \qquad 1.25\,\beta_i^2 = \frac{1}{L_iC_i},$$

and

$$\alpha_i = \frac{\beta_i}{2} = \frac{R_i}{2L_i}.$$

Consider, for example, the membrane point which responds maximally
to 4500 cps (i.e., $\beta_i = 2\pi \cdot 4500$). For convenience take $R_i'$ and $L_i$ as 10K
and 1h, respectively. Then:

$$C_i' = 0.0035\ \mu\text{f}$$

$$C_i = 0.001\ \mu\text{f}$$

$$R_i = 28\text{K}.$$
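
The component selection can be checked numerically. The following is a minimal sketch, assuming the design relations read $a = 1/R_0'C_0'$, $1.25\,b^2 = 1/L_0C_0$, $a = R_0/2L_0$ for the middle-ear network and $\beta_i = 1/R_i'C_i'$, $1.25\,\beta_i^2 = 1/L_iC_i$, $\beta_i/2 = R_i/2L_i$ for a membrane network. The function names and default arguments are illustrative only and do not come from the original text.

```python
import math

def middle_ear_components(f_b_cps=1500.0, r0_prime=10e3, l0=2.0):
    """Return (C0', C0, R0) in farads, farads, ohms (assumed relations)."""
    b = 2 * math.pi * f_b_cps          # b = 2*pi*1500 in the text
    a = b / 2                          # because b = 2a
    c0_prime = 1 / (a * r0_prime)      # from a = 1/(R0' C0')
    c0 = 1 / (1.25 * b**2 * l0)        # from 1.25 b^2 = 1/(L0 C0)
    r0 = 2 * l0 * a                    # from a = R0/(2 L0)
    return c0_prime, c0, r0

def membrane_components(f_max_cps, r_prime=10e3, l=1.0):
    """Return (Ci', Ci, Ri) for the point responding maximally at f_max_cps."""
    beta = 2 * math.pi * f_max_cps
    c_prime = 1 / (beta * r_prime)     # from beta = 1/(Ri' Ci')
    c = 1 / (1.25 * beta**2 * l)       # from 1.25 beta^2 = 1/(Li Ci)
    r = beta * l                       # from beta/2 = Ri/(2 Li)
    return c_prime, c, r

for name, (cp, c, r) in [("middle ear", middle_ear_components()),
                         ("4500-cps point", membrane_components(4500.0))]:
    print(f"{name}: C' = {cp * 1e6:.4f} uf, C = {c * 1e6:.4f} uf, R = {r / 1e3:.1f} K")
```

Under these assumptions the computed values agree, to within rounding, with the component values quoted in the text.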

For each membrane point the relative gains of the amplifiers are set to 
satisfy the amplitude relations implied in Fig. 9. This takes account of 
the constant multiplying factors in the model specification. 

Some representative impulse responses of the analog circuit of Fig. 15 
are shown in Fig. 16(a). One notices the degradation in time resolution
as the response is viewed at points more apicalward. 

As indicated earlier, the spatial derivative may figure in the conversion 
of mechanical to neural activity. In previous psychoacoustic work 6 it 
was found useful to approximate the first spatial derivative by a finite 
difference. The present circuit can be used to provide such an approxi- 
mation by taking the differences between the deflections of adjacent, 
uniformly spaced points. Fig. 16(b) shows first-difference approximations 
to the spatial derivative obtained from the analog circuit by taking: 

$$\frac{\partial y}{\partial x} \approx \frac{\Delta y}{\Delta x},$$

with $\Delta x = 0.3$ mm. These responses can be compared (for apical points)
with the calculated derivative in Fig. 12. Because of amplification, the
polarity of the derivative traces in Fig. 16(b) is inverted from that shown
in Fig. 12.

Fig. 16. (a) Impulse responses measured on the network of Fig. 15; (b) first
difference approximations to the spatial derivative measured from the network
of Fig. 15.
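
The first-difference operation can be sketched in a few lines. This is an illustration only: it assumes the tap waveforms are available as sampled lists ordered from the stapes toward the apex, and the sign convention (farther tap minus nearer tap) is an assumption rather than something stated in the text. The function name is hypothetical.

```python
def spatial_first_difference(displacements, dx_mm=0.3):
    """Approximate the spatial derivative between adjacent, uniformly spaced taps.

    displacements: list of per-tap waveforms (equal-length lists of samples),
    ordered by distance from the stapes. Returns one difference waveform per
    adjacent pair of taps, in displacement units per mm.
    """
    return [
        [(far - near) / dx_mm for near, far in zip(w_near, w_far)]
        for w_near, w_far in zip(displacements, displacements[1:])
    ]

# Two taps, three time samples each (made-up numbers for illustration).
taps = [[0.0, 1.0, 0.5],
        [0.0, 0.4, 0.8]]
print(spatial_first_difference(taps))
```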



II. SOME RELATIONS BETWEEN SUBJECTIVE AND PHYSIOLOGICAL BEHAVIOR 

The preceding discussion derived computationally tractable models 
for the operation of the middle ear and basilar membrane. Can these 
models be used to further our understanding of auditory subjective 
behavior? In particular can they help us to relate psychoacoustic phe- 
nomena to the physical operation of the peripheral ear? 

The models describe only the mechanical functioning of the ear. Any 
comprehensive hypothesis about auditory perception must make pro- 
visions for the transduction of mechanical displacement into neural 
activity. The details of this process are not well understood, and the
assumptions that presently can be made must be of an approximate and
simplified nature. Three such assumptions will be useful to us. Although
simplifications, they do not seem to violate known physiological facts.

The first assumption (actually a fact) is that sufficient local deforma- 
tion of the basilar membrane elicits neural activity in the terminations 
of the auditory nerve at the organ of Corti. Such neural activity may 
be in the form of volleys triggered synchronously with the stimulus, or 
in the form of a signaling of place localization of displacement. Implicit 
in this is the notion that the displacement, or perhaps spatial derivatives 
of displacement, 5 must exceed a certain threshold before nerve firings 
take place.* The number of neurons activated depends upon amplitude 
of membrane displacement in a monotonic fashion. Psychological and
physiological evidence suggests that the intensity of the neural activity 
is a power-law function of the mechanical displacement. A single neuron 
is presumably a binary (fired or unfired) device. It is refractory for a
given period after firing; hence a limit exists upon the rate at which it 
can fire. Large populations of neurons, all of which are not refractory at 
the same time, can give rise to neural volleys at rates greater than the 
maximum rate of a single element. 

Second, neural firings occur on only one "polarity" of the displacement, 
or of the spatial derivative. 7 In other words, some process like half- 
wave rectification operates on the displacement function, or on its spatial 

* Earlier, in Section 1.3, it was suggested that the spatial derivative of
displacement, as well as the displacement, may be important in the
mechanical-to-neural conversion process. Further explication of this allusion
and the present one is necessary.

Electrophysiological experiments on guinea pig [G. von Békésy, J. Acoust.
Soc. Am., 25 (1953) p. 786; H. Davis, Ann. Oto. Rhin. Laryn. 67 (1958) p. 789]
suggest that the outer and inner hair cells of the organ of Corti differ in their
sensitivities to mechanical stimulation. The outer hair cells are sensitive to
bending only in a direction transverse to the long dimension of the membrane. Further
than this, only outward bending of the hairs (away from arch of Corti) produces 
an electrical potential in the scala media favorable for exciting the auditory nerve 
endings. This outward bending is produced on upward motions of the basilar 
membrane — that is, motions which drive it toward the tectorial membrane and 
produce a relative shear. 

On the other hand, the inner hair cells, which reside between the arch of Corti 
and the axis of the cochlear spiral, are sensitive to bending in a direction parallel 
to the long dimension of the membrane. In this case only bending toward the apex 
of the cochlea produces a scala media potential favorable for stimulating the 
nerve. So far as a given point on the membrane is concerned, the inner hair cells 
are essentially sensitive to the longitudinal gradient of displacement — that is, 
to the spatial derivative in the long dimension. Furthermore, the inner cells fire 
only on that polarity of the gradient which corresponds to bending toward the 
apex. Threshold for firing of the inner cells apparently is about 20 db higher than 
that for the outer cells. 

If this behavior is common to the human ear, displacement gradient, as well as
displacement, may be significant. As the results of Section 1.3 show, the
displacement and the spatial derivative have gross features which are similar. For this
reason the hypotheses and arguments to be put forward in this section generally 
can apply equally to displacement and gradient. 
derivatives. Third, the membrane point displacing with the greatest
amplitude originates the predominant neural activity. (More strictly, 
perhaps, this is the point experiencing the greatest transverse and longi- 
tudinal bending.) The latter may also operate to suppress or inhibit 
activity arising from neighboring points. 
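
The three assumptions can be stated operationally. The sketch below is only an illustration of the hypotheses, not a physiological model: the function names, the sampled-waveform representation, and the threshold value are all hypothetical, and the power-law intensity relation and refractory behavior are deliberately left out.

```python
import math

def half_wave_rectify(waveform):
    """Keep only one polarity of the displacement (second assumption)."""
    return [max(v, 0.0) for v in waveform]

def firing_epochs(waveform, threshold):
    """Sample indices where the rectified waveform rises through threshold
    (first assumption: firing requires displacement above a threshold)."""
    rect = half_wave_rectify(waveform)
    return [i for i in range(1, len(rect)) if rect[i - 1] < threshold <= rect[i]]

def dominant_point(waveforms):
    """Index of the membrane point with the greatest peak displacement
    (third assumption)."""
    peaks = [max(abs(v) for v in w) for w in waveforms]
    return peaks.index(max(peaks))

# A sinusoidal displacement sampled 20 times per cycle, three cycles long.
wave = [math.sin(2 * math.pi * k / 20) for k in range(60)]
print(firing_epochs(wave, threshold=0.5))   # one epoch per cycle
```

With a sinusoidal displacement the sketch yields one firing epoch per cycle at a fixed phase, which is the stimulus-locked behavior the assumptions are meant to capture.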

These assumptions, along with the results from the models, have in a 
number of instances been helpful in interpreting subjective auditory 
behavior. Without going into any case in great depth, let us consider 
several of these instances. 

2.1 Pitch Perception 

Pitch is that subjective attribute which admits of a rank ordering on a 
scale ranging from low to high. As such, it correlates strongly with ob- 
jective measures of frequency. One important facet of auditory percep- 
tion is the ability to assign pitch to sounds which exhibit time periodic- 
ity. 

Consider first the pitch of pure (sinusoidal) tones. For such stimuli the 
basilar membrane displacement at any point is sinusoidal. The frequency 
responses given previously in Fig. 9 indicate the relative amplitudes of 
displacement versus frequency for different membrane points. At any 
given frequency, one point on the membrane responds with greater am- 
plitude than all others. In accordance with the previous assumptions, 
the most numerous neural volleys are elicited at this maximum point. 
For frequencies sufficiently low (less than about 1000 cps) they are 
triggered once per cycle and at some fixed epoch on the displacement 
waveform. Subsequent processing by higher centers presumably ap- 
preciates the periodicity of the stimulus-locked volleys. For frequencies 
greater than about 1000 to 2000 cps, electro-physiological evidence sug- 
gests that synchrony of neural firings is not maintained. 9 Pitch is appar- 
ently perceived through a signaling of the place of greatest membrane 
displacement or displacement gradient. The poorer frequency resolution 
of points lying in the basal part of the basilar membrane probably also 
contributes to the psychoacoustic fact that pitch discrimination becomes 
less acute at higher frequencies. 10 

Suppose the periodic sound stimulus is not a simple sinusoidal tone 
but is more complex, say repeated sharp pulses. What pitch is heard? 
For purpose of illustration, imagine the stimulus to be alternately posi- 
tive and negative periodic impulses. Such a pulse train has a spectrum 
which is odd-harmonic. Pulse rate and fundamental frequency are in the 
ratio of two-to-one. If the pulses occur slowly enough, the membrane 

Fig. 17. Displacement responses for alternate positive and negative pulses
simulated by the network of Fig. 15.

displacements at all points along its length will resolve each pulse in 
time. That is, the membrane will have time to execute a complete, 
damped impulse response at all points for each pulse, positive or nega- 
tive. If, however, the fundamental frequency of the train is sufficiently 
high, the fundamental component will be resolved (in frequency) at the 
most apically responding point. This situation is illustrated by the traces 
in the first and second columns of Fig. 17. These waveforms were meas- 
ured on analog networks as illustrated in Fig. 15. The oscilloscope gain 
was adjusted for constant peak-to-peak amplitude to display the wave- 
forms more effectively. The proper relative amplitudes are therefore not 
indicated in the traces. 
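
The odd-harmonic character of the alternating-polarity pulse train is easy to confirm numerically. The sketch below takes one fundamental period containing a positive unit impulse followed half a period later by a negative one, and evaluates a plain discrete Fourier transform; the sample count and helper name are arbitrary choices made for illustration.

```python
import cmath

def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform of one period of x."""
    n = len(x)
    return [
        abs(sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)))
        for k in range(n)
    ]

n = 16                      # samples per fundamental period (arbitrary)
period = [0.0] * n
period[0] = 1.0             # positive impulse
period[n // 2] = -1.0       # negative impulse, half a period later

mags = dft_magnitudes(period)
for k, m in enumerate(mags[:6]):
    print(f"harmonic {k}: magnitude {m:.3f}")
```

The even-numbered lines (including the zero-frequency term) vanish and the odd-numbered lines are equal in magnitude, while the period holds two pulses, so the pulse rate is twice the fundamental frequency.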

For the low pulse-rate condition (25 cps fundamental) in the first 
column, one might imagine that neural firing synchronous with each 
pulse, regardless of polarity, would be triggered at all points along the 
membrane. The perceived pitch might be expected to be that of the pulse 
rate, and it is. 6 For such stimulation, the models indicate that the great- 
est membrane displacements occur near the middle portion, in the region 
maximally responsive to 1000 to 2000 cps. 

In the second column, the fundamental frequency is 200 cps. This is 
high enough for the apical end of the membrane to resolve the fundamental
component. The displacement of the 200 cps point on the membrane
is the fundamental sinusoid, while the more basal points continue
to resolve each pulse in time. At the apical end, neural volleys might be 
expected to be triggered synchronously at the fundamental frequency, 
while toward the basal end the displacements favor firings at the pulse 
rate. For this condition, the apical fundamental-correlated displacements 
are generally of greater amplitude and subjectively more significant than 
the basal, pulse-rate displacements. The fundamental-rate volleys generally
predominate, and a pitch is heard corresponding to 200 sec⁻¹.

If this same stimulus is high-pass filtered at a sufficiently high fre- 
quency, only the basal displacements remain effective in producing the 
pitch percept. If the present arguments continue to hold, this filtering 
should again give rise to a pulse-rate pitch because the time resolution 
in the basal end separates each pulse, whether positive or negative. 
Psychoacoustic measurements show this in fact to be the case. 11 Representative
membrane displacements for this condition, as given by the
models, are shown in the third column of Fig. 17. 

A slightly more subtle effect is obtained if the high-pass filtering is 
made at a low harmonic number, for example, at the second harmonic 
so as to remove only the fundamental component. Under certain of these 
conditions the significant membrane displacement can be seen to exhibit 
displacements that favor fundamental-rate neural activity. The pitch 
percept would then be expected to be the fundamental, even though the 
fundamental is not present in the stimulus. Again psychoacoustic measurements
give this result. 12 The effect is the so-called residue pitch.

Another of the many variations of pulse stimuli, but one which is 
diagnostically useful in exploring pitch perception, is the periodic, uni- 
polar impulse train in which the equispaced pulses have amplitudes 
(areas) alternately a and b. Such a stimulus exhibits an infinite number 
of complex spectral zeros, the imaginary parts of which occur at every 
other spectral line. The spectral envelope is cycloidal and is described 
by: 

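$$K(s) = \left(a + b\,e^{-sT/2}\right), \tag{34}$$

where $T$ is the fundamental period. The spectral zeros lie at

$$s = -\frac{2}{T}\ln\frac{a}{b} \pm j\,\frac{2(2n+1)\pi}{T}, \qquad n = 0, 1, 2, \ldots,$$

and the ratio of odd-line amplitude to even-line amplitude is:

$$R = \frac{|K(s)|_{s = j2\pi(2n+1)/T}}{|K(s)|_{s = j4\pi n/T}} = \frac{|a - b|}{a + b}. \tag{35}$$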

Fig. 18. Subjective pitch assigned to a periodic pulse train composed of sharp
pulses with alternate amplitudes a and b. ΔL = 20 log a/b. The pitch equation is
determined by matching the uniform pattern B to the test pattern A.
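
A short numerical check of the spectral envelope may help fix ideas. The sketch assumes the envelope has the form $K(s) = a + b\,e^{-sT/2}$, with zeros at $s = -(2/T)\ln(a/b) \pm j\,2(2n+1)\pi/T$ and an odd-to-even line ratio of $|a-b|/(a+b)$. The function names and the 200-cps, 10-db example values are illustrative and not taken from the experiment.

```python
import cmath
import math

def envelope(s, a, b, period):
    """Assumed spectral envelope K(s) = a + b*exp(-s*T/2)."""
    return a + b * cmath.exp(-s * period / 2)

def odd_even_ratio(delta_l_db):
    """Odd-line to even-line amplitude ratio for a level difference 20 log a/b."""
    a_over_b = 10 ** (delta_l_db / 20)
    return abs(a_over_b - 1) / (a_over_b + 1)

def spectral_zero(a, b, period, n=0):
    """Upper-half-plane zero of the envelope for index n."""
    return complex(-(2 / period) * math.log(a / b),
                   2 * (2 * n + 1) * math.pi / period)

T = 1 / 200.0                       # fundamental period for a 200-cps train
a, b = 10 ** (10 / 20), 1.0         # 10 db level difference (example value)
odd = abs(envelope(2j * math.pi * 1 / T, a, b, T))    # first (odd) line
even = abs(envelope(2j * math.pi * 2 / T, a, b, T))   # second (even) line
print(odd / even, odd_even_ratio(10.0))
print(abs(envelope(spectral_zero(a, b, T), a, b, T)))  # numerically near zero
```

For the 4 db and 16 db limits mentioned in connection with Fig. 18, `odd_even_ratio` can be evaluated directly to see how strongly the odd lines (those at odd multiples of the fundamental) are present relative to the even ones.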



One psychophysical question that could be posed is "For a given sen- 
sation level, fundamental frequency and a/b ratio, what is the pitch?" 
When this question is answered by means of a pitch-matching experi- 
ment, the result for several values of the variables is shown in Fig. 18. 
These results are for a sensation level of approximately 45 db. When the 
a/b ratio is less than about 4 db, one never hears any pitch except the 
pulse rate. On the other hand, when the a/b ratio is greater than about 16 
db, one never hears any pitch but one-half the pulse rate, i.e., the funda- 
mental. Between these level differences, a transition from one pitch mode 
to the other takes place. The transition depends upon fundamental fre- 
quency as shown in the figure. As in the previous case, calculations and 
observations with the analog networks show the correlation between 
these modes and the displacement patterns of the basilar membrane. 
Unlike the situation depicted previously in Fig. 17(b), however, a pitch 
percept equivalent to half the pulse rate does not necessarily mean that 
the fundamental frequency is resolved by the membrane. 

A somewhat different example of pitch stimulus is periodically inter- 
rupted random noise. Under certain conditions of interruption rate, duty 
factor and frequency content, chopped noise possesses a pitch. It is 
relevant to consider how such a signal is represented in the mechanical 
displacements of the basilar membrane. 

Because of the ear's frequency characteristics, a broad-spectrum noise 
would be expected to produce the greatest displacements somewhere 
near the middle of the membrane, around the 1000 to 2000 cps point. 
Let us look at these displacements for a flat-spectrum noise which is 
chopped with constant duty factor of 0.2. The response waveforms for 

Fig. 19. Displacement responses simulated by the network of Fig. 15 for
periodically interrupted noise. Constant duty factor = 0.2.

several interruption rates are given in Fig. 19. For the slow rate, 500
sec⁻¹, the noise bursts are well resolved in time. Nerve volleys synchronous
with the onset of the noise bursts might be expected for this stimulation.
As the interruption rate is increased to upwards of 2000 sec⁻¹,
however, neither the stapes nor membrane displacements resolve each 
burst separately in time. Stimulus-locked synchrony of the neural ac- 
tivity might be expected to be impaired or lost, even if the neural volleys 
could be elicited at this rate. Psychoacoustic observations bear this out. 
They show that it is difficult to ascribe a pitch to interrupted noise for 
rates greater than about 1000 sec⁻¹ even under favorable conditions of
low duty factor. It is not clear how much of this limit is determined by
neural resolution, and how much by mechanical. Very likely both factors
contribute to the resultant behavior.

2.2 Binaural Lateralization 

Another aspect of perception is binaural lateralization. This is the sub- 
jective ability to locate a sound image at a particular point inside the 
head. The phenomenon is conventionally observed in earphone listening. 
If identical clicks (impulses of sound pressure) are produced simultane- 
ously at the two ears, the average listener hears the sound image to be 
located in the center of his head. If one click is produced a little earlier
(or with slightly greater intensity) than the other, the sound image shifts
toward the earlier (or more intense) ear. This shift continues with increasing
time difference until the image moves entirely to one side and
eventually breaks apart. One then hears individual clicks located at the 
ears. 

Naively we suppose the subjective position of the image to be deter- 
mined by some sort of computation of coincidence between neural 
volleys. The volleys originate at the periphery and travel to higher 
centers via synaptic connections. The volley initiated earliest progresses 
to a point in the neural net where a coincidence occurs with the later 
volley to produce a subjective image appropriately off center. To the 
extent that intensity differences can shift the image position, intensity 
probably is coded, at least partially, in terms of the volley timing. As 
was the case in pitch perception, there are several areas in binaural 
phenomena where the ear models have been helpful in suggesting ex- 
planations of, and correlations with, subjective behavior. One of these 
areas is the effects of phase upon the binaural lateralization of clicks. 

Suppose one produces impulses of pressure at the two ears, of equal 
intensity but of opposite polarity (i.e., one a rarefaction and the other 
a condensation). How would a listener adjust the times of occurrence 
of such pulses so that he hears the sound image exactly in the center of
his head? Let us consider what the displacement waveforms and the
mechanical-to-neural conversion hypotheses would predict.

An impulse of pressure rarefaction draws the eardrum and stapes 
initially outward and causes the membrane displacement to be initially 
upward. A condensation pulse, on the other hand, causes an initially in- 
ward displacement of drum and stapes and consequently an initially 
downward movement of the membrane. At any given point on the mem- 
brane the waveforms of displacement produced by these two stimuli 
differ only in sign; that is, one is the negative of the other. Typical dis- 
placements of apical and basal points caused by rarefaction and conden- 
sation pulses are shown in Fig. 20. (These traces are essentially the im- 
pulse responses calculated previously in Fig. 7.) 

The top diagram in Fig. 20 is illustrative of the displacement response 
of points lying in the apical (low-frequency) half of the membrane. The 
solid curve is the displacement for a rarefaction pulse, the dashed for a 
condensation. The abscissa at the top is in terms of the product βt, where
β is the radian frequency of maximal response for the particular apical
point. The lower abscissa on the top graph shows time scales appropriate 
to the specific points maximally responsive to 1200 and 600 cps, respec- 
tively. The lower graph shows the displacement appropriate to points 
lying in the basal (high-frequency) half of the membrane. As discussed 

Fig. 20. Apical and basal displacements of the basilar membrane for rarefaction
and condensation pressure impulses at the eardrum. These responses are from
those computed in Fig. 7 for [$G(s)F_1(s)$].

earlier, this waveform has essentially the same shape and time scale for 
all basal points. 

Following our earlier assumptions, we suppose that neural firings (at 
least of the more sensitive outer hair cells) take place at some amplitude 
level on the upward deflections of the membrane. The curves suggest, 
therefore, that a time difference should exist between the firings for a 
rarefaction pulse and those for a simultaneous condensation pulse. The 
difference should be about one-half cycle on the displacement waveforms. 
The earlier results indicate that for broad-spectrum excitation the great- 
est deflections occur near the middle of the membrane, in the vicinity 
of the region maximally responsive to 1000 to 2000 cps. For such a place, 
the half-cycle intervals are of the order of 250 to 500 μsec. The time
scale for the 1200 cps point in the top graph is indicative of this magni- 
tude. 
Assuming that simultaneity of neural firings at the two ears produces
a central sound image, a rarefaction pulse and a condensation pulse 
should produce a centered image if the condensation is advanced in time 
to bring its positive displacement peak approximately into coincidence 
with that of the rarefaction. This means advancing the condensation, or 
letting it lead, by about 250 to 500 μsec. Furthermore, the periodic nature
of the displacements suggests that multiple fusions of the sound image 
might occur by virtue of neural firings triggered at secondary positive 
excursions. These should occur for interaural times which are full-cycle 
increments of the principal fusion, including lead and lag shifts. A half- 
cycle lead of the condensation would represent a principal fusion; a half- 
cycle lag would be a secondary fusion. 

If cophasic pulses are delivered to the two ears, that is, rarefaction- 
rarefaction or condensation-condensation, the same argument says that 
the principal fusion should obtain for zero interaural time difference, and 
secondary fusions for full-cycle shifts, either lead or lag. 

The preceding remarks relate to broadband, unmasked pulses, where 
the neural response is likely to originate near the central portion of the 
basilar membrane. Suppose the dominant response were elicited from 
some other place on the membrane. The interaural time difference for 
lateralization ought to change in accordance with what the curves in 
Fig. 20 imply. Band-filtering of the pulse stimuli is an obvious means for 
confining membrane activity to specific regions. This has the disadvan- 
tage, however, that the stimulus signal is contaminated with the impulse 
response of the filter, so that it is inconvenient, if not difficult, to analyze 
the membrane displacement. The objective can be achieved more con- 
veniently by selectively masking the membrane response with filtered 
random noise. The significant neural information can then be originated 
in a normally less responsive region by obscuring the maximally re- 
sponding place with noise. 

The top graph in Fig. 20 suggests that if the disparity between the 
interaural times for lateralizing cophasic and antiphasic pulses is to be 
increased, the significant response must originate from places more api- 
calward, that is, at points which respond maximally at lower frequencies. 
In such a case high-pass (HP) noise should be used to obscure activity 
in the basal part of the membrane. Low-pass (LP) noise, on the other 
hand, causes the coherent information to arise from the basal section. 
Here, because of the nature of the pulse response, the disparity between 
cophasic and antiphasic fusions is predicted to be roughly constant with 
place along the membrane, and should be of the order of 250 μsec. This,
too, ought to be the minimum interaural disparity that can be produced 
for the antiphasic situation. 
An experiment was performed to determine whether the predicted
phenomena are in fact manifest. 14 The arrangement to measure the effects 
is shown in Fig. 21. Twin pulse generators produced identical 0.1-msec 
rectangular pulses in separate channels at a sensation level of 40 db. 
The repetition rate of the pulses was 10 sec⁻¹. Random noise from two
uncorrelated generators was filtered by identical filters and added to the 
signal channels. This noise level completely masked the selected portions 
of the pulse spectra. HP and LP noise cutoff frequencies of 600, 1200, and 
2400 cps were used in addition to no masking. Condenser microphones 
fitted with ear-insert plugs were used as earphones to provide good trans- 
duction of the pulse signals. The subject was provided a delay control 
which permitted continuous adjustment of the time of occurrence of one 
pulse relative to the other over the range ±5 msec. A switch could re- 
verse the polarity of the pulse delivered to one earphone. 

The results of this experiment for three listeners are summarized in 
terms of median responses in Fig. 22. For HP masking, Fig. 22(a), the 
interaural time for the principal antiphasic lateralizations is seen to 
increase as the cutoff frequency of the HP noise is lowered. For these 
conditions the maximally responding unmasked place on the membrane 

Fig. 22. Effects of masking upon the lateralization of cophasic and antiphasic
pulses.



should be that just below the cutoff frequency, $f_c$, of the filter. The interaural
time for the antiphasic fusion ought then to be about $\pm 1/2f_c$.
The data follow this value reasonably well.

The secondary antiphasic fusions (condensation lag) are roughly a
reflection of the principal ones in the x-axis. The time separation between
the principal and secondary points is approximately the predicted full-cycle
shift, or about $1/f_c$. The principal cophasic fusions fall along the
axis for zero interaural times, and the secondary cophasic fusions (high- 
est and lowest curves) fall at about the right value for the full-cycle 
shift. 
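
The predicted fusion times for the HP-masking conditions follow directly from the half-cycle and full-cycle arguments. The sketch below tabulates them for the three cutoff frequencies used in the experiment; the function name, the dictionary keys, and the sign convention (negative meaning that the condensation, or the controlled pulse, leads) are assumptions made for illustration.

```python
def predicted_fusions_usec(f_c_cps):
    """Predicted interaural times, in microseconds, when the dominant
    unmasked place responds maximally near f_c_cps."""
    half_cycle = 1e6 / (2 * f_c_cps)
    full_cycle = 1e6 / f_c_cps
    return {
        "antiphasic_principal": -half_cycle,    # condensation leads by half a cycle
        "antiphasic_secondary": +half_cycle,    # condensation lags by half a cycle
        "cophasic_principal": 0.0,
        "cophasic_secondary": (-full_cycle, +full_cycle),
    }

for f_c in (600.0, 1200.0, 2400.0):
    p = predicted_fusions_usec(f_c)
    print(f"f_c = {f_c:6.0f} cps: antiphasic at {p['antiphasic_principal']:.0f} "
          f"and {p['antiphasic_secondary']:+.0f} usec, "
          f"cophasic secondaries at +/-{p['cophasic_secondary'][1]:.0f} usec")
```

Note that in this sketch the separation between the principal and secondary antiphasic fusions is one full cycle, as is the offset of the secondary cophasic fusions from zero.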

The results for the LP masking, Fig. 22(b), indicate that the hypothe- 
ses about fusions of basal-end information are essentially sustained. The 
cophasic-antiphasic disparity is roughly constant at about 300 μsec.
Secondary images, however, are not easily heard because the LP noise is 
a more potent out-of-band masker. Both masking conditions make it 
clear that the significant neural timing information can be made to origi- 
nate from different points along the membrane. Further, the neural 
timing is intimately related to the individual mechanical excursions of 
the membrane at the significant point. 

Some electrophysiological evidence also exists to support these psycho- 
acoustic results and the assumptions made earlier. Peake 15 measured the 
latency of the gross neural component, $N_1$, in cat's ear for stimulation
by rarefaction and condensation pulses. For moderately high signal 
levels, the difference in latencies is found to be of the order of 200 to 300 
μsec with condensation pulses giving the greater latencies. In addition,
very recent data by Kiang 7 on the activity of single, peripheral nerve 
units suggest that the firings are synchronized with the individual uni- 
polar displacements of the membrane, as conjectured here. 

2.3 Time-Intensity Trade

In other binaural experimentation it has been observed that the position 
of a sound image can be maintained stationary by trading relative in- 
tensity against relative time of occurrence of pulses at the two ears. That 
is, the movement of the sound image towards the ear receiving a leading 
click can be offset by an increase in intensity of the pulse at the lagging 
ear. To a certain extent such a trading relation is implied in the calcu- 
lated membrane responses and in the simple hypotheses about conver- 
sion of mechanical to neural activity. It is worth considering the degree 
to which the experimentally observed trade can be explained by the 
membrane relations. 

Consider again binaural excitation of the ears by short, unipolar pulses. 
The earlier assumptions about neural firings on upward displacements 
of the membrane, in excess of a fixed threshold, imply a time-intensity 
trade. For ease in illustration, consider that the form of the impulse response
of the membrane for middle to apical points is essentially specified
by the model $f_3(t)$ given in (5). For simplicity this can be written
without constants as

$$f_3(\theta) = \theta^2 e^{-\theta/1.7} \sin\theta, \tag{36}$$

where $\theta = \beta t$. The waveform of this function has already been plotted
in Fig. 4.

Imagine that the amplitude of a rarefaction stimulus has been set to 
threshold value so that neural firings are produced exactly at the first 
positive crest of the displacement. Now if the intensity of the stimulating 
pulse is steadily increased, the threshold level will be crossed at succes- 
sively earlier times on the initial quarter cycle of the wave. In terms of 
the model, the advance in time of the threshold crossing is a simple 
function of the stimulus amplitude, and we can compare it with experi- 
mentally measured figures. 

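For reasons that will be obvious presently, we take

$$\ln f_3(\theta) = 2\ln\theta - \frac{\theta}{1.7} + \ln\sin\theta. \tag{37}$$

Differentiating with respect to $\theta$,

$$\frac{d[\ln f_3(\theta)]}{d\theta} = \frac{2}{\theta} - \frac{1}{1.7} + \cot\theta, \tag{38}$$

and taking the partial with respect to time,

$$\frac{\partial[\ln f_3(\theta)]}{\partial t} = \beta\left(\frac{2}{\theta} - \frac{1}{1.7} + \cot\theta\right). \tag{39}$$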

Equation (39) gives, in effect, the time-intensity trade for the wave in
terms of nepers per second, as a function of the epoch at which threshold
is crossed. In psychoacoustic tests the trade has customarily been
specified in terms of msec/db, that is, the number of milliseconds by
which the stimulus in one ear must be advanced to offset a relative
intensity increase of one db in the other ear. Equation (39) can be put
in terms of msec/db by taking its reciprocal, and multiplying by $10^3/8.7$.
One often sees this trade plotted as a function of intensity or sensation
level of the stimulus. Let us arbitrarily take the positive maximum,
$f_3(\theta_{+\max})$, as the threshold level of displacement. An increase in intensity
of $X$ db will then cause the threshold to be crossed at an epoch,
$\theta < \theta_{+\max}$, that satisfies: $-8.7 \ln [f_3(\theta)/f_3(\theta_{+\max})] = X$ db. We can therefore
plot (39) [converted to msec/db] versus (37) [converted to db re
$f_3(\theta_{+\max})$] for common values of the threshold crossing $\theta$. This function,
for three different apical points on the membrane, is shown in Fig. 23(a).

Fig. 23. Time-intensity trade predicted for (a) apical points by membrane
model $f_3(t)$; (b) for basal points by stapes derivative $g(t)$.

The curves suggest that the trade of msec/db is greatest for low signal
levels and diminishes for higher levels. They also indicate that the trade
in msec/db is greater for points closer to the apex, that is, for lower β.
Broadband pulse excitation of the model, as previously stated, produces 
greatest displacements near the middle of the membrane. The 1000-cps 
point is representative of this region. Low-level values of the trade for 
this point are on the order of 0.03 msec/db. 
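
The calculation behind Fig. 23(a) can be sketched numerically. The code assumes the normalized waveform $f_3(\theta) = \theta^2 e^{-\theta/1.7}\sin\theta$ with $\theta = \beta t$, takes the first positive crest as the threshold level, and uses 20/ln 10 (about 8.7) db per neper. The helper names and the bisection solver are illustrative choices, not the method used to prepare the original figure.

```python
import math

NEPER_DB = 20 / math.log(10)        # about 8.686 db per neper

def ln_f3(theta):
    """Logarithm of the assumed waveform theta^2 exp(-theta/1.7) sin(theta)."""
    return 2 * math.log(theta) - theta / 1.7 + math.log(math.sin(theta))

def dln_f3(theta):
    """Derivative of ln f3 with respect to theta."""
    return 2 / theta - 1 / 1.7 + 1 / math.tan(theta)

def bisect(func, lo, hi, iters=200):
    """Root of func on [lo, hi], assuming a single sign change."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

THETA_MAX = bisect(dln_f3, 1.0, 3.0)    # first positive crest of f3

def trade_msec_per_db(level_db, f_max_cps):
    """Trade for a stimulus level_db (> 0) above threshold at the membrane
    point responding maximally to f_max_cps."""
    beta = 2 * math.pi * f_max_cps
    target = ln_f3(THETA_MAX) - level_db / NEPER_DB
    theta = bisect(lambda th: ln_f3(th) - target, 1e-6, THETA_MAX)
    return 1e3 / (NEPER_DB * beta * dln_f3(theta))

for level in (1.0, 5.0, 10.0, 20.0):
    print(f"{level:5.1f} db re threshold: {trade_msec_per_db(level, 1000.0):.4f} msec/db")
```

Under these assumptions the trade for the 1000-cps point near 1 db above threshold comes out close to 0.03 msec/db, falls as the level rises, and scales inversely with the frequency of maximal response.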
The earlier arguments also indicated that the impulse responses of
basal points were similar to the time derivative of stapes displacement. 
A time-intensity trade computed from these responses ought to be 
suggestive of the minimum msec/db value that could be expected if the 
trade were based upon basal activity. We can use (11) and approximate
the basal displacements by the stirrup derivative, $g(t)$. Letting
$(bt) = \Phi$,

$$\ln g(\Phi) = -\frac{\Phi}{2} + \ln(2\sin\Phi + \cos\Phi - 1). \tag{40}$$

And,

$$\frac{\partial t}{\partial \ln g(\Phi)} = \frac{2(2\sin\Phi + \cos\Phi - 1)}{b(3\cos\Phi - 4\sin\Phi + 1)}. \tag{41}$$

Again, expressing (40) in db relative to $g(\Phi_{+\max})$ and (41) in msec/db,
the two can be plotted for common values of $\Phi$. When this is done the
trading relation obtained is shown in Fig. 23(b). Because of the substantial
asymmetry in $g(\Phi)$, the function is computed for the initial
quarter of the positive deflection and initial quarter of the negative
deflection (that is, the initial positive deflection if the displacement
phase were reversed). The former would be appropriate for rarefaction
pulse excitation; the latter for condensation. These figures are, of
course, subject to the uncertainty connected with the value b and
the approximation of the basal displacements by $g(\Phi)$. Nevertheless,
the trading values thus obtained fall reasonably close to those for the
1000-cps point shown in Fig. 23(a). To the extent that g(t) is a reasonable
approximation of the basal displacements, the results indicate that
the trade for condensation pulses should be slightly greater than that
for rarefactions.
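
As an illustration of how such a trading curve can be tabulated, the following minimal sketch evaluates the level in db and the trade in msec/db along the initial rise of a waveform of the form $e^{-\Phi/2}(2\sin\Phi + \cos\Phi - 1)$, with $\Phi = bt$. The function names are hypothetical, and the value b = 2π(1500) rad/sec is an assumption adopted only for this example, since the text notes that b is uncertain. The slope expression comes from differentiating the logarithm of this waveform with respect to time.

```python
import math

# Assumed value, for illustration only: b = 2*pi*1500 rad/sec.
B_ASSUMED = 2.0 * math.pi * 1500.0
DB_PER_NEPER = 20.0 / math.log(10.0)  # about 8.686 db per unit of ln(amplitude)


def g(phi):
    """Waveform shape exp(-phi/2) * (2 sin(phi) + cos(phi) - 1); constant factor dropped."""
    return math.exp(-phi / 2.0) * (2.0 * math.sin(phi) + math.cos(phi) - 1.0)


def dt_dlng(phi, b=B_ASSUMED):
    """dt / d(ln g) in seconds, obtained by differentiating ln g with phi = b*t."""
    num = 2.0 * (2.0 * math.sin(phi) + math.cos(phi) - 1.0)
    den = b * (3.0 * math.cos(phi) - 4.0 * math.sin(phi) + 1.0)
    return num / den


def first_positive_peak():
    """Bisection for the first zero of d(ln g)/d(phi) in (0, pi/2)."""
    lo, hi = 1e-6, math.pi / 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if 3.0 * math.cos(mid) - 4.0 * math.sin(mid) + 1.0 > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def trade_point(phi, b=B_ASSUMED):
    """Return (level in db re the positive peak, trade in msec/db) at crossing phase phi."""
    level_db = 20.0 * math.log10(g(phi) / g(first_positive_peak()))
    trade = 1.0e3 * dt_dlng(phi, b) / DB_PER_NEPER
    return level_db, trade


if __name__ == "__main__":
    peak = first_positive_peak()
    for frac in (0.2, 0.4, 0.6, 0.8):
        level, trade = trade_point(frac * peak)
        print(f"phi = {frac * peak:.3f} rad  level = {level:7.2f} db  trade = {trade:.4f} msec/db")
```

In this sketch the trade grows as the crossing phase approaches the peak, which corresponds to low signal levels, and shrinks as the crossing moves toward the onset of the deflection.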

Some psychoacoustic data are available for comparison with these 
calculations. David, Guttman and van Bergeijk 16 used 2-kc high-pass 
clicks to measure the trading function and obtained a median result 
shown in Fig. 24. Harris 17 used both HP and LP filtered pulses and pure 
tones in a related investigation. Several of his results are also shown in
Fig. 24. The subjective data for the HP pulses are clearly greater than 
the predictions from the model. The results for 1400 LP, however, are 
more nearly of the magnitude suggested by the computations. The 
previous computations also suggest that the msec/db trade should 
increase in magnitude as the significant neural information is elicited 
from more apical (low-frequency) points. Harris found, however, that 
for LP clicks with cutoffs between 200 and about 1000 cps, the trade
was nearly constant at about 0.03 msec/db. One difficulty in comparing 
the computed and measured data is that we do not know how to equate 
values on the abscissae of Figs. 23 and 24. That is, we do not know 
what sensation level corresponds to the zero-db reference amplitude of 
the displacement wave. Only general directions and trends can there- 
fore be legitimately compared. 

Another difficulty in comparing the data is that the computed time- 
intensity trades assumed ideal impulse excitation of the ear. The ex- 
perimental measurements, on the other hand, used pulses which were 
HP or LP filtered. The effect of the filter response upon that of the 
membrane is somewhat uncertain. The experimental determinations 
and the computations may not therefore be strictly comparable. To 
attempt to obviate this difficulty, we made some cursory measurements 
of the trade using the masking technique described earlier in Section 2.2. 
For a sensation level of 40 db, and with unmasked rarefaction clicks, 
one trained subject from the previous lateralization experiment made 
the $\Delta t$-$\Delta I$ swap plotted as the lower curve in Fig. 25. A binaural
masking by 600-cps HP noise presumably constrains the coherent
neural activity to come from a more apical point (somewhere near the
600-cps point). For such a masking the same subject made the trade
indicated by the upper curve, giving values about twice as great as the 
unmasked trade. The slope of the unmasked function at the origin is 
approximately 0.03 msec/db. That for the 600 HP masking is about 
0.05 msec/db. 

Fig. 25. Effects of masking upon time-intensity trade for broadband,
cophasic pulses.

Clearly these data for one subject are tentative and must be confirmed
or refuted by additional experimentation. To the extent that they are
correct, however, they support the general predictions of the model as 
to the frequency (or membrane place) dependence of the time-intensity 
trade. They do not agree well with absolute magnitudes of the computed 
trade, and there is still the question of how to equate abscissae. It is 
highly probable, too, that the time-intensity trade involves neural 
mechanisms not here included. Even so, the mechanical operations ap- 
pear to go a long way in contributing to an explanation of the phenom- 
enon. A time-intensity trade has also been observed at the neural level. 
In the cat's ear, Peake 15 finds that an intensity change of about 40 db 
in a stimulating pressure click causes a reduction in the latency of the 
N1 neural component by roughly 0.6 msec. A simple division of these
figures gives 0.015 msec/db for the trade. This figure falls within the
range predicted by the model for human hearing. 

As a final point in this theme, the same arguments can be made for
pure tones. For such stimuli the membrane displacements are also sinusoidal
(at least over a large intensity range) and have the form



$$f_\beta(t) = K_\beta(\omega) \sin \omega t, \qquad (42)$$

where $K_\beta(\omega)$ is an amplitude versus frequency factor appropriate to
the membrane point maximally responsive to radian frequency $\beta$;
$K_\beta(\omega)$ is largest, of course, for $\omega = \beta$. The previous argument gives

$$\ln f_\beta(t) = (\ln \sin \omega t + \ln K_\beta), \qquad (43)$$

and,

$$\frac{d}{dt} (\ln f_\beta) = \omega (\cot \omega t). \qquad (44)$$



A plot of this last result in terms of msec/db versus amplitude in db 
(relative to the amplitude for threshold crossing at $\omega t = \pi/2$) is shown
in Fig. 26. In order of magnitude and frequency dependence, these 
values seem to compare reasonably well with results of Harris for 200- 
and 500-cps pure tones, previously shown in Fig. 24. 
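
The sine-wave result lends itself to a direct numerical tabulation. The sketch below, with hypothetical function and parameter names, takes the reciprocal of $\omega \cot \omega t$ and converts it to msec/db using 8.686 db per unit of natural-log amplitude. It assumes that an amplitude X db above the reference places the threshold crossing where $\sin \omega t = 10^{-X/20}$ on the rising quarter-cycle.

```python
import math

DB_PER_NEPER = 20.0 / math.log(10.0)  # about 8.686 db per unit of ln(amplitude)


def sine_trade_msec_per_db(freq_cps, level_db_above_reference):
    """Trade in msec/db for a sinusoid whose amplitude is the given number of db
    above the amplitude that just reaches threshold at omega*t = pi/2."""
    omega = 2.0 * math.pi * freq_cps
    s = 10.0 ** (-level_db_above_reference / 20.0)  # sin(omega*t) at the crossing
    omega_t = math.asin(s)                          # crossing phase, rising quarter-cycle
    dt_dlnf = math.tan(omega_t) / omega             # reciprocal of omega*cot(omega*t), seconds
    return 1.0e3 * dt_dlnf / DB_PER_NEPER


if __name__ == "__main__":
    for freq in (200.0, 500.0):
        for level in (5.0, 10.0, 20.0):
            trade = sine_trade_msec_per_db(freq, level)
            print(f"{freq:5.0f} cps  +{level:4.1f} db  trade = {trade:.4f} msec/db")
```

Because the trade scales as $1/\omega$ at a fixed level, the values for 200 cps are 2.5 times those for 500 cps in this sketch.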

Fig. 26. Time-intensity trade predicted for sine wave stimuli.

2.4 Threshold Sensitivity

The combined response curves in Fig. 9 indicate that the ear is more 
sensitive to certain frequencies than to others. This is well known to be 
subjectively true. To what extent, then, are the variations in the thresh- 
old of audibility accounted for by the mechanical sensitivity of the ear? 
We can use the model responses to examine the question. 

The envelope of the peak responses in Fig. 9 can be compared with 
the subjectively determined minimum audible pressure for pure (sine) 
tones. Fig. 27 shows this comparison. The agreement is quite poor, al- 
though the gross trends are similar. The model responses here are on the 
basis of a 1500-cps critical frequency for the middle ear. The earlier dis- 
cussion has pointed up the uncertainty of this value. The middle-ear 
critical frequency chosen to illustrate the computational technique was 
that derived from Zwislocki's data. The latter, in turn, were based upon 
one of Bekesy's investigations. In other investigations, Bekesy also 
found middle ear cutoffs higher than 1500 cps, so some uncertainty 
exists as to where this number should be fixed. Obviously the choice of
this constant does not alter the computational method or analytical
technique. If we choose instead a critical frequency of 3000 cps for the
middle ear, the fit to the threshold curve at high frequencies is more
respectable. The match at low frequencies, however, is not improved,
but we are less concerned about this for a different reason.

Fig. 27. Relation of mechanical sensitivity of model to subjective monaural
threshold for pure tones. (Abscissa: frequency in cycles per second.)

Fig. 28. Average number of ganglion cells per mm length of organ of Corti
(after Guild et al. 18).

For the low frequencies, the disparity between mechanical and sub- 
jective sensitivity probably is a neural effect. According to our earlier 
assumptions, the number of neurons activated bears some monotonic 
relation to amplitude of membrane displacement. Perception of loudness 
is thought to involve possibly temporal and spatial integrations of 
neural activity. If a constant integrated activity were equivalent to 
constant loudness, the difference between mechanical and subjective 
sensitivities might be owing to a sparser neuron density in the apical 
(low-frequency) end of the cochlea. There is physiological evidence to 
this effect. 

In histological studies Guild et al 18 counted the number of ganglion 
cells per mm length of the organ of Corti. Their results for normal ears 
are summarized in Fig. 28. These data show a slight decrease in the 
number of cells at the basal end and a substantial decrease in the density 
as the apex is approached. The innervation over the middle of the mem- 
brane is roughly constant. 

One can pose the same questions about threshold sensitivity for short 
pulses or clicks of sound. For brief pulses of sufficiently low repetition
rate, the maximal displacements of the membrane, as stated before, are
near the middle. According to the model, this continues to be the case
for pulse rates well in excess of several hundred per second. The resonance
properties of the membrane in this region are such as to resolve in time
each individual exciting pulse. If, then, the predominant displacement
takes place at one point for a large range of pulse rates, polarity
patterns, and pulse durations, how might the subjective threshold vary
and how might it be correlated with the membrane motion? One investigation
of this question has led to a model for pulse threshold loudness. 19
These results can be partially summarized. 

Thresholds of audibility for a variety of periodic pulse trains with 
various polarity patterns, pulse rates and durations are shown in Fig. 
29. One notices that the thresholds are relatively independent of polarity 
pattern. For pulse rates less than 100 pps, the thresholds are relatively 
independent of rate, and dependent only upon pulse duration. Above 
100 pps, the thresholds diminish with increasing pulse rate. Amplitude 
of membrane displacement would be expected to be a function of pulse 
duration and to produce a lower threshold for the longer pulses, which 
is the case. For rates greater than 100 sec$^{-1}$, however, some other
nonmechanical effect apparently is of importance. The way in which audible
pulse amplitude diminishes suggests a temporal integration with a time
constant of the order of 10 msec.

Using the earlier assumptions about conversion of mechanical to
neural activity, one might ask "What processing of the membrane displacement
at the point of greatest amplitude would reflect the constant
loudness percept at threshold?" A possible answer is suggested by the 
operations illustrated in Fig. 30. 19 The first two blocks represent middle- 
ear transmission [as specified in ((})] and basilar membrane displacement 
[vicinity of the 1000-cps point, as specified in (31)]. The diode represents 
the half-wave rectification associated with neural firings on unipolar 
motions of the membrane. The RC integrator has a 10-msec time con- 
stant, as suggested by the threshold data. The power-law element (ex- 
ponent = 0.6) represents the power-law relation found in loudness 
estimation.* A meter indicates the peak value of the output of the power- 
law device. When all stimulus conditions represented by points on the 
threshold curves in Fig. 29 are applied to the circuit, the output meter 
reads the same value: that is, threshold. 
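
The last four operations of this chain (rectifier, RC integrator, power-law element, and peak meter) can be sketched in discrete time. The following is a minimal illustration under stated assumptions, not the circuit of Fig. 30 itself: the middle-ear and basilar membrane blocks are omitted, so the input is taken to be the membrane displacement waveform, and a rectangular pulse train is used only as a stand-in for it. The function names and the sampling interval are hypothetical.

```python
import math


def pulse_train(rate_pps, width_s, amplitude, duration_s, dt):
    """Rectangular unipolar pulse train; a stand-in for a displacement waveform."""
    period_n = int(round(1.0 / (rate_pps * dt)))
    width_n = max(1, int(round(width_s / dt)))
    n = int(round(duration_s / dt))
    return [amplitude if (i % period_n) < width_n else 0.0 for i in range(n)]


def meter_reading(displacement, dt, tau=0.010, exponent=0.6):
    """Half-wave rectifier -> RC integrator (time constant tau) -> power law -> peak meter."""
    # One-step gain of a first-order RC stage for an input held constant over dt.
    alpha = 1.0 - math.exp(-dt / tau)
    y = 0.0      # integrator output
    peak = 0.0   # largest power-law output seen so far
    for x in displacement:
        rectified = x if x > 0.0 else 0.0   # half-wave rectification
        y += alpha * (rectified - y)        # RC integration
        peak = max(peak, y ** exponent)     # power-law element, then peak hold
    return peak


if __name__ == "__main__":
    dt = 1.0e-5  # assumed sampling interval, 10 microseconds
    for rate in (25, 50, 100, 200, 400):
        reading = meter_reading(pulse_train(rate, 1.0e-4, 1.0, 0.2, dt), dt)
        print(f"{rate:4d} pps  meter reading = {reading:.4f}")
```

For equal pulse amplitude the reading rises with pulse rate once successive pulses fall within the integration time, so a smaller pulse amplitude suffices to reach a fixed threshold reading at high rates. This is the qualitative behavior that the 10-msec integrator is meant to capture.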

One can also notice how this model might be expected to perform for 
sine wave inputs. Because the integration time is 10 msec, frequencies 
greater than about 100 cps produce meter readings proportional to the 
average value of the half-wave rectified sinusoid. In other words, the 
meter reading is proportional to the amplitude of the sine wave into the 
rectifier. Two alterations in the network circuitry are then necessary. 
First, the basilar membrane network appropriate to the point maximally 
responsive to the sine frequency must be used. This may be selected 
from an ensemble of networks. And second, to take account of the sparser 
apical innervation, the signal from the rectifier must be attenuated for 
the low-frequency networks in accordance with the difference between 
the mechanical and subjective sensitivity curves in Fig. 27. The power- 
law device still operates to simulate the appropriate growth of loudness 
with sound level. 
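
The proportionality stated for sine-wave inputs follows from the average value of a half-wave rectified sinusoid. As a brief worked step, with A denoting the displacement amplitude at the rectifier input and $T = 2\pi/\omega$ the period (symbols introduced here only for illustration):

```latex
\overline{x}_{\mathrm{rect}}
  = \frac{1}{T}\int_{0}^{T/2} A\sin\omega t\,dt
  = \frac{A}{\omega T}\Bigl[-\cos\omega t\Bigr]_{0}^{T/2}
  = \frac{2A}{2\pi}
  = \frac{A}{\pi}.
```

When the period is short compared with the 10-msec time constant, the integrator output stays close to this average, so that, assuming the power-law element follows the integrator, the reading varies approximately as $(A/\pi)^{0.6}$.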



* The power-law device is not necessary for threshold indications of "audible- 
inaudible." It is necessary, however, to represent the growth of loudness with 
sound level, and to provide indications of subjective loudness above threshold.

2.5 Pure-Tone Masking

Masking is defined as the increase in the threshold of audibility of one 
sound caused by the presence of another. The models with which we 
have been dealing describe the mechanical frequency sensitivity of the 
ear and hence ought also to imply something about the masking of one 
pure tone by another. 

The significant neural information for a pure-tone stimulus is assumed 
to come from the membrane point which responds maximally (mechan- 
ically) to that frequency. The ability to detect activity correlated with 
such a tone ought likewise to be related to the amplitude of displacement 
produced at this same point by any interfering (masking) sound. In 
other words, the relative amplitudes of displacement caused at the point 
by the tone and masker might be expected to be related to the shift in 
threshold of the tone. From the model we can determine the relative 
amplitudes produced by tone and masker at the membrane point which 
responds maximally to the tone. For low frequencies, where middle-ear 
attenuation is not appreciable, this can be done simply from the mem- 
brane response curves such as shown in Fig. 3. 

Let us take, for example, a masker of 400 cps (because there are
corresponding subjective data for this condition). The relative levels
of maskee and masker are shown by the dashed curve in Fig. 31. (These
levels are read on the righthand ordinate.) Subjective threshold
measurements for the same conditions produce the solid curve. 20 One sees
that the agreement is not particularly close, although the curves have
similar gross shapes. The psychoacoustic measurement indicates less
masking at frequencies removed from the masker than the mechanical
response implies. This might suggest at least two possibilities: one,
that the upper and lower skirts of the membrane resonances are a
little steeper than we think; or two, that when the maskee has a relative
level as much as about 10 db or so greater than the masker (at the
maskee point) some neural inhibitory mechanism functions to suppress
the masker even more.

Fig. 31. Masking of one tone by another (a) predicted by the model; and (b)
measured by Egan and Hake. 20

One notices irregularities in the subjective masking curve at frequencies
where the tone is an integral multiple of the masker. These are
produced by beats and subjectively generated harmonics. One notices,
too, that when tone and masker are the same frequency, the measure- 
ment is essentially a determination of the intensity limen. For example, 
the masking at 400 cps is 40 db, which means that a 400-cps tone must 
be raised 40 db above its unmasked threshold to be just audible in 
the presence of another 400-cps tone at a sound-pressure level of 80 db 
(re 0.0002 dyne/cm$^2$). The unmasked threshold (minimum audible
pressure) for a 400-cps tone is approximately 10 db spl (see Fig. 27). 
The maskee is just detectable, therefore, when its level is about 50 
db spl, or 30 db less than the masker. For an in-phase (or out of phase) 
condition, the maskee could maximally increase (decrease) the inten- 
sity of the masker by about 0.3 db. This is roughly the size of the in- 
tensity limen measured at this sensation level. 21 
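
The arithmetic behind the 0.3-db figure can be checked directly. The short sketch below, with a hypothetical function name, assumes that equal-frequency tones add as pressure amplitudes, in phase or out of phase, and computes the resulting change in level of the combined tone.

```python
import math


def level_change_db(maskee_db_below_masker):
    """Return (in-phase increase, out-of-phase decrease) in db of the combined level
    when a maskee of the same frequency is added to the masker."""
    ratio = 10.0 ** (-maskee_db_below_masker / 20.0)  # maskee/masker pressure ratio
    in_phase = 20.0 * math.log10(1.0 + ratio)
    out_of_phase = 20.0 * math.log10(1.0 - ratio)
    return in_phase, out_of_phase


if __name__ == "__main__":
    masker_spl = 80.0          # db spl, from the text
    unmasked_threshold = 10.0  # db spl, approximate value from the text
    masking = 40.0             # db, from the text
    maskee_spl = unmasked_threshold + masking
    up, down = level_change_db(masker_spl - maskee_spl)
    print(f"maskee level = {maskee_spl:.0f} db spl")
    print(f"in phase: {up:+.2f} db, out of phase: {down:+.2f} db")
```

For a maskee 30 db below the masker the change is a little under 0.3 db in either direction, in line with the value quoted above.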

When this same masking comparison is made for higher frequencies, 
the middle-ear transmission must be considered. If the middle-ear cutoff 
used in the model calculations is used, the agreement between subjective 
and mechanical results is poor at high frequencies. This again argues 
that the normal critical frequency for the middle ear is somewhat higher 
than that used to illustrate the model calculations. 

The mechanical response also shows why a lower-frequency tone is a 
more effective masker than a higher-frequency tone. The reason is simply 
that the frequency response of a given point on the basilar membrane 
has a low-frequency skirt less steep than its high-frequency skirt. This 
same fact also suggests why low-frequency hearing is so difficult to 
impair by local injury or disease in the ear. The shallow low-frequency 
skirts of the response of all points along the membrane show that even 
basal points can respond appreciably to low-frequency stimuli. Even 
if the apical end of the basilar membrane were destroyed, basal locations 
could provide some low-frequency response. 

Essentially in the same vein, these relations suggest why high-frequency
hearing is so susceptible to impairment. The high-frequency
skirt of the frequency responses is quite sharp. Damage to any basal 
location leaves no other point capable of responding substantially to 
that frequency. 

2.6 Conclusion 

It seems clear that the extent to which subjective behavior can be 
correlated with, identified in, and predicted by the mechanical operation 
of the peripheral ear is rather substantial. The models developed here 
have been found to be useful computational tools in the analyses of a 
number of different psychoacoustic problems. They have, in fact, pre- 
cipitated several experiments by predicting hearing phenomena which 
were later confirmed by the experiments. Further, electrophysiological 
data obtained recently link neural activity intimately with the individual 
mechanical excursions of the membrane. These findings also lend support 
to the simple assumptions about the conversion of mechanical to neural 
information. 

The models do not, of course, account for higher-order neural functions 
and hence describe only a peripheral part of the hearing process. Even 
so, they seem in many cases to contribute substantially to physiological 
explanations of subjective behavior. As more knowledge is gained about 
mechano-neural conversion and about neural processing, analytical 
specification of the mechanical operation, such as developed here, may 
prove increasingly useful.